Python power programmer needed to create a function to ingest and process data as a stream

$30-250 NZD

Closed
Posted: more than 5 years ago

Paid on delivery
Looking for a Python developer / data engineer who:
- has experience ingesting and processing data as a stream
- has demonstrable experience handling 2-3 GB of source data
- knows object-oriented programming concepts, professional documentation methods and Python lambda functions (these are a must)

Environment: Oracle VM box, Linux Ubuntu 14.04.5 LTS, PyCharm, Anaconda.

Data is available as TSV extracts from multiple sources in CDL. The data engineer should merge the TSV extracts using the correct join techniques. Because the data is delivered in compressed form, it should be read as a stream rather than decompressed in full, since the uncompressed data might not fit into memory. Memory-efficient code is therefore expected. The merged data will be transformed and stored in a PostgreSQL database. The function should follow the object-oriented paradigm, with continuous integration and deployment in focus. Version control is also expected.

Some remarks:
- Each data snapshot can contain multiple headerless main data files in TSV format, each up to 2 GB in size. The engineer should read the files as a stream while unpacking them, because they usually do not fit into RAM.
- In addition to the main data files, each snapshot has a file with the header names and multiple lookup files that map the numeric IDs from the main data to strings, comparable to a foreign key in an SQL database.
- Data should be read and transformed record by record (stream or mini-batch processing).
- Each combined and transformed record should be prepared for multiple data sinks, e.g. as SQL query strings that write a record into PostgreSQL or MS SQL. The engineer will write a write adapter for each data sink with a common interface, so that the same function call can be used to write into any of the specified data sinks.

Code provided should be modular, reusable and well documented.
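The streaming read and lookup join described above could be sketched roughly as follows. This is only an illustration under several assumptions that the posting does not confirm: the files are gzip-compressed (the posting only says "compressed"), the header file holds one tab-separated line of column names, each lookup file has two columns (ID, string), and unknown IDs are left unchanged. The function names `open_text`, `read_header`, `load_lookup` and `stream_records` are hypothetical.

```python
import csv
import gzip
from typing import Dict, Iterator, List, TextIO


def open_text(path: str) -> TextIO:
    """Open a file as text, decompressing on the fly if it ends in .gz (assumed format)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", newline="")
    return open(path, "r", encoding="utf-8", newline="")


def read_header(path: str) -> List[str]:
    """Read the column names, assuming one tab-separated line in the header file."""
    with open_text(path) as fh:
        return next(csv.reader(fh, delimiter="\t", quoting=csv.QUOTE_NONE))


def load_lookup(path: str) -> Dict[str, str]:
    """Load a small lookup file (assumed columns: ID, string) fully into memory."""
    with open_text(path) as fh:
        reader = csv.reader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)
        return {row[0]: row[1] for row in reader if len(row) >= 2}


def stream_records(
    path: str, header: List[str], lookups: Dict[str, Dict[str, str]]
) -> Iterator[Dict[str, str]]:
    """Yield one joined record at a time; the main file is never held in memory."""
    with open_text(path) as fh:
        for row in csv.reader(fh, delimiter="\t", quoting=csv.QUOTE_NONE):
            record = dict(zip(header, row))
            # Replace numeric IDs by their strings; unknown IDs are kept as-is.
            for column, mapping in lookups.items():
                if column in record:
                    record[column] = mapping.get(record[column], record[column])
            yield record
```

Only the lookup tables are held in memory, which presumes they are small compared with the main data files. A caller would iterate over `stream_records(...)` and hand each record to a data sink as it arrives.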
- The engineer needs to know how to build Python modules with classes, using OOP decomposition practices and inheritance (e.g. abstract classes).
- Code should have unit tests, where appropriate.
- Code will be implemented as a Python AWS Lambda function. The engineer should be familiar with building Lambda functions and should ideally have a local development environment set up for building and uploading Lambda functions.
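One possible shape for the write adapters with a common interface, using an abstract base class, is sketched below. All class and method names here are hypothetical. The sketch assumes each adapter receives an `execute` callable taking `(sql, params)`, such as a DB-API cursor's `execute`, and that the PostgreSQL driver uses `%s` placeholders (as psycopg2 does) while the MS SQL driver uses `?` (as pyodbc does). The posting itself does not name any driver.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List, Sequence, Tuple


class WriteAdapter(ABC):
    """Common interface: every data sink is written through the same write() call."""

    placeholder: str  # parameter marker of the assumed driver, set by subclasses

    def __init__(self, table: str, execute: Callable[[str, Sequence[Any]], Any]):
        self.table = table
        self._execute = execute  # assumption: e.g. a DB-API cursor.execute

    @abstractmethod
    def quote(self, identifier: str) -> str:
        """Quote a table or column name in the sink's SQL dialect."""

    def build_insert(self, record: Dict[str, Any]) -> Tuple[str, List[Any]]:
        """Build a parameterized INSERT statement for one record."""
        columns = ", ".join(self.quote(name) for name in record)
        marks = ", ".join(self.placeholder for _ in record)
        sql = f"INSERT INTO {self.quote(self.table)} ({columns}) VALUES ({marks})"
        return sql, list(record.values())

    def write(self, record: Dict[str, Any]) -> None:
        sql, params = self.build_insert(record)
        self._execute(sql, params)


class PostgresAdapter(WriteAdapter):
    placeholder = "%s"

    def quote(self, identifier: str) -> str:
        return '"' + identifier.replace('"', '""') + '"'


class MsSqlAdapter(WriteAdapter):
    placeholder = "?"

    def quote(self, identifier: str) -> str:
        return "[" + identifier.replace("]", "]]") + "]"
```

Because the values travel as parameters rather than being pasted into the SQL string, escaping is left to the driver. Passing a recording function as `execute` also makes the adapters easy to cover with unit tests without a live database.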
Project ID: 18026938

About the project

2 bids
Remote project
Active: 5 years ago

About the client

Faridabad, India
5.0
35
Member since: Mar 9, 2017

Client verification

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)