DATA ENGINEER SENIOR

The PNC Financial Services Group, Inc. has an opening for a Data Engineer Senior in Raleigh, NC. Within the Enterprise Data Management department, the position is responsible for performing analytical tasks on vast amounts of structured and unstructured data to extract actionable business insights. Specific duties include: (i) ingesting large volumes of data from various external sources into Hadoop and building data pipelines; (ii) developing IT-based Hadoop strategies for business data implementation, data acquisition, execution, reporting, and archive recovery; (iii) advising IT, business, applications, and operations counterparts to ensure data integrity and availability; (iv) developing proofs of concept (PoCs) for business problems leveraging the Hadoop ecosystem; and (v) educating business community users on Hadoop best practices and the latest code-optimization techniques. A Master's degree in Technology, Analytics, Engineering, Mathematics, Information Technology, or Information Systems Management plus 3 years of experience conducting data engineering and analytics on large volumes (petabytes) of data is required.
Experience must include: (i) handling, manipulating, and analyzing large datasets (multiple terabytes of data); (ii) leveraging Trifacta or Informatica for data preparation and wrangling activities; (iii) utilizing data query tools including HQL, SQL, R, and Python to manipulate, analyze, and interpret data; (iv) writing and implementing code/software to clean and transform multiple terabytes of unstructured datasets with numerical and textual data; (v) building data pipelines using PySpark to prepare consumable datasets for data analytics and reporting; (vi) integrating with external data sources to identify interesting and relevant trends; (vii) programmatically extracting data from a Hadoop database and transforming the data into a presentable form such as an ROC curve, map, or Tableau visualization; (viii) ingesting real-time data from the Kafka streaming platform into Hadoop and preparing consumable datasets; (ix) converting existing complex SQL code from Teradata/SAS/Oracle into Hive Query Language/PySpark; (x) optimizing and troubleshooting YARN/Impala applications; and (xi) providing training to hundreds of business users.
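The listing is a requirements summary rather than a tutorial, but item (vii) can be illustrated concretely. Below is a minimal, hypothetical sketch (not part of the posting) of turning scored records into ROC-curve points in plain Python; the function and variable names are assumptions for illustration only, and in practice this step would typically use a plotting or ML library downstream of the Hadoop extraction.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs for an ROC
    curve, sweeping the decision threshold from high to low.

    Assumes both classes are present (at least one positive and one
    negative label); labels are 1 for positive, 0 for negative.
    """
    # Sort records by descending score so each step lowers the threshold.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    pos = sum(labels)            # total positives
    neg = len(labels) - pos      # total negatives
    tp = fp = 0
    points = [(0.0, 0.0)]        # curve starts at the origin
    for _score, label in pairs:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points
```

For example, `roc_points([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])` walks the four records in score order and ends at (1.0, 1.0), the point where every record is classified positive.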
Area: Raleigh ›
Category: Jobs ›
Subcategory: Internet ›