Title: Hadoop with Python and ETL
Location: New York City, NY
Duration: Long Term Contract

Primary Goals:
- Perform data mapping and ETL tasks to load data into target databases from diverse sources such as flat files, delimited files, Excel, XML, PDF, email, and web services.
- Load disparate data sets and reconcile them.
- Pre-process data using Hive and Python.
- Translate complex functional and technical requirements into detailed designs.
- Analyze vast data stores and uncover insights.
- Optimize performance and code.
- Maintain security and data privacy.
- Create algorithms in Spark/Scala that leverage distributed computing.
- Learn and develop faster, more efficient data on-boarding tools and techniques.

Required Skills:
- Play a critical role in database design and development, data integration and ingestion, and in designing ETL architectures using a variety of ETL tools and techniques.
- Plan and execute secure, good-practice data integration strategies and approaches.
- Experience and interest in Big Data technologies (Hadoop/Spark).
- Experience with at least one ETL tool (e.g. Informatica, Talend, DataStage).
- Strong programming skills in Python: a solid grasp of object-oriented Python, plus strong data transformation and data management skills.
- Strong experience with Python, SQL, Hive, and shell scripting (DOS, Bash).
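As a minimal illustration of the loading-and-reconciliation work described above, the sketch below parses delimited text, normalizes a field, and produces a reconciliation summary of loaded versus rejected records. The field names and helper function are hypothetical, chosen for the example; they do not come from the posting.

```python
import csv
import io


def load_and_reconcile(raw_csv: str):
    """Load rows from delimited text, normalize the (hypothetical)
    'amount' field, and return the rows plus a reconciliation summary."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    rows, rejected = [], 0
    for row in reader:
        try:
            row["amount"] = float(row["amount"])  # normalize string to number
            rows.append(row)
        except (ValueError, KeyError):
            rejected += 1  # count bad records instead of failing the load
    summary = {
        "loaded": len(rows),
        "rejected": rejected,
        "total_amount": sum(r["amount"] for r in rows),
    }
    return rows, summary


raw = "id,amount\n1,10.5\n2,oops\n3,4.5\n"
rows, summary = load_and_reconcile(raw)
```

In practice the same loaded/rejected/total counts would be compared against the source system's own totals to confirm the load reconciles end to end.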