Hello

I'm Ravi Kumar

Sr. Big Data Engineer

Hello! I’m Ravi Kumar, a highly skilled Big Data developer with extensive experience in designing, developing, and implementing large-scale data solutions. I am seeking a challenging position in a dynamic organization where I can apply my expertise in Hadoop, Spark, and other Big Data technologies to drive innovative solutions and achieve business objectives.

Professional Skills

Hadoop Distribution
Cloud Services (GCP, AWS)
Hive
Apache Sqoop
Apache Kafka
Data Warehouse Migration
Apache Airflow
Apache NiFi
Docker

My Experience

  • My Contribution to the Hadoop World

  • Sqoop Importer

    I built a Sqoop Importer tool to transfer data between Hadoop (HDFS) and relational databases. The tool is GUI-based and written in Python. I have released its first version and am continuing to enhance its features.
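A tool like this typically wraps the `sqoop` CLI. A minimal sketch of how the command for an import might be assembled in Python (the connection details and `run_import` helper are hypothetical; a real tool would also handle passwords, e.g. via `--password-file`, and incremental imports):

```python
import subprocess

def build_sqoop_import_cmd(jdbc_url, username, table, target_dir, num_mappers=4):
    """Build the argument list for a `sqoop import` call from GUI inputs."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]

def run_import(cmd):
    # A GUI front end would collect the fields above and then execute:
    return subprocess.run(cmd, check=True)

# Illustrative values only
cmd = build_sqoop_import_cmd(
    "jdbc:mysql://db.example.com/sales", "etl_user", "orders", "/data/raw/orders"
)
```

Building the argument list separately from running it keeps the command easy to preview in the GUI before execution.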

  • HCL Projects (2017-2021):

  • Roles and Responsibilities:

    My role in this project is data ingestion: I ingest multiple CSV/TXT/compressed data files from Amazon S3 into HDFS and create scripts based on the files ingested.

    Responsible for weekly reconciliation of cluster data against the AWS S3 environment.

  • Automation of the ingestion process through Python scripting.


    Responsible for debugging user queries and correcting them.


    DQM checks are performed on the ingested data as per the summary sheets provided by the vendors.

  • Data movement from one cluster to the production cluster.


    Perform reconciliation of the data ingested into the ingestion cluster against S3.


    Perform extensive cleaning and validation of the raw data provided by vendors.

  • Joined Big Data Practice Team

  • Part of the CoE Big Data practice team, where I was assigned to multiple projects to install, configure, and enable security on various Big Data tools.


    Involved in multiple RFP and POC activities.


    Automated many project-based tasks using Python.

    • I integrated the RStudio tool with HDFS and Actian VH databases.

    • I also integrated the Pentaho tool with HDFS and Actian VH databases.

    • I used PySpark as an ETL tool.

    • I used Apache Airflow to create and schedule Python jobs.

    • I created a Python program for detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database.

    • I wrote Python programs to schedule and monitor data pipelines in Apache Airflow.

    • I worked on integrating Actian VH with a Hadoop cluster and benchmarked its performance against Hive and Impala.
    • I worked on the BuyWay RFP with my team and contributed solutions.
    • I suggested the Hadoop platform architecture for the State Government of Texas project.
    • I migrated databases from on-premise systems to the Google Cloud Platform BigQuery data warehouse.
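The record-cleaning program mentioned above can be sketched in plain Python. The field names and cleaning rules here are purely illustrative (drop rows missing required fields, coerce amounts to numbers, normalize junk placeholders, deduplicate by id):

```python
def clean_records(records, required=("id", "amount")):
    """Detect and correct (or drop) corrupt rows in a list of dicts."""
    JUNK = {"", "N/A", "null", "?"}       # hypothetical junk placeholders
    cleaned, seen = [], set()
    for row in records:
        if any(row.get(k) in (None, "") for k in required):
            continue  # drop rows missing required fields
        # normalize junk string values to None
        row = {k: (None if isinstance(v, str) and v.strip() in JUNK else v)
               for k, v in row.items()}
        try:
            row["amount"] = float(row["amount"])
        except (TypeError, ValueError):
            continue  # drop rows whose amount cannot be corrected
        if row["id"] in seen:
            continue  # drop duplicate ids
        seen.add(row["id"])
        cleaned.append(row)
    return cleaned

result = clean_records([
    {"id": 1, "amount": "10.5", "note": "ok"},
    {"id": 1, "amount": "10.5", "note": "dup"},
    {"id": 2, "amount": "oops", "note": "bad amount"},
    {"id": 3, "amount": "7", "note": "N/A"},
])
```

On larger data the same rules would typically run as Spark transformations rather than a Python loop.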

  • Cross-cluster search: Implemented the ELK (Elasticsearch, Logstash, Kibana) stack to filter and analyze log data stored on clusters in different data centers.



    Data Profiling: In this activity, the raw data contained unwanted values such as nulls, junk values, missing values, and duplicates. To improve data quality, I removed all of these using SparkSQL, making the data suitable for visualization.
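Before cleaning, profiling quantifies how bad the data is. The original work used SparkSQL on large data (e.g. `COUNT(*) ... GROUP BY ... HAVING COUNT(*) > 1` for duplicates); a plain-Python illustration of the same checks, with hypothetical columns and junk markers:

```python
from collections import Counter

def profile(rows):
    """Summarize per-column null/junk counts and the number of duplicate rows."""
    JUNK = {"", "N/A", "null"}            # hypothetical junk placeholders
    null_counts = Counter()
    for row in rows:
        for col, val in row.items():
            if val is None or (isinstance(val, str) and val.strip() in JUNK):
                null_counts[col] += 1
    # a duplicate is any extra copy of an identical row
    dupes = sum(n - 1 for n in Counter(
        tuple(sorted(r.items())) for r in rows).values() if n > 1)
    return {"nulls": dict(null_counts), "duplicates": dupes}

report = profile([
    {"city": "Noida", "temp": 31},
    {"city": "Noida", "temp": 31},
    {"city": None, "temp": "N/A"},
])
```

The resulting counts tell you which columns need imputation or filtering before visualization.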



    Migrate an on-premise database to GCP BigQuery: In this activity, I took backups of the required databases, converted all backup files to CSV format using a Python script, transferred them to Google Cloud Storage, and then loaded the data into the BigQuery data warehouse using bq load. I also migrated Oracle databases to BigQuery using Informatica Intelligent Cloud Services.
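The CSV-conversion and load steps can be sketched as follows. The dataset, table, bucket, and schema are hypothetical; the `bq load` flags shown (`--source_format`, `--skip_leading_rows`) are standard options of the bq command-line tool:

```python
import csv
import io

def rows_to_csv(rows, header):
    """Serialize backup rows to CSV text, ready to copy to Cloud Storage."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

def bq_load_cmd(dataset, table, gcs_uri, schema):
    """Build the `bq load` command run once the CSV lands in Cloud Storage."""
    return [
        "bq", "load",
        "--source_format=CSV",
        "--skip_leading_rows=1",
        f"{dataset}.{table}",
        gcs_uri,
        schema,
    ]

# Illustrative values only
csv_text = rows_to_csv([(1, "alice"), (2, "bob")], ["id", "name"])
cmd = bq_load_cmd("sales", "customers", "gs://my-bucket/customers.csv",
                  "id:INTEGER,name:STRING")
```

In practice the CSV would be written to a file and copied with `gsutil cp` before running the load command.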




  • Spark Structured Streaming with Kafka: In this activity, I had a 3-node Kafka cluster with one node dedicated to ZooKeeper, and installed standalone Spark and Hadoop for processing and storing the data. I fetched real-time data from a weather website through its API and created a Kafka producer to ingest the weather data into a Kafka topic. I then used Structured Streaming to process the data and stored the results in HDFS, using Kibana for visualization.
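The producer side of such a pipeline can be sketched like this. The message fields are hypothetical, and `publish` assumes the kafka-python client and a reachable broker (so it is defined but not run here):

```python
import json
import time

def weather_to_message(reading):
    """Serialize one weather-API reading into the bytes sent to the Kafka topic."""
    payload = {
        "city": reading["city"],
        "temp_c": float(reading["temp_c"]),
        "ts": reading.get("ts", int(time.time())),
    }
    # sort_keys gives a stable layout for downstream schema inference
    return json.dumps(payload, sort_keys=True).encode("utf-8")

def publish(readings, bootstrap="localhost:9092", topic="weather"):
    """Send readings to Kafka; requires a running broker and kafka-python."""
    from kafka import KafkaProducer  # assumed client library
    producer = KafkaProducer(bootstrap_servers=bootstrap)
    for r in readings:
        producer.send(topic, weather_to_message(r))
    producer.flush()

msg = weather_to_message({"city": "Noida", "temp_c": "31.4", "ts": 1})
```

On the consuming side, Spark would read the same topic with `spark.readStream.format("kafka")` and parse the JSON payload before writing to HDFS.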




    Spark Structured Streaming with Kafka: I carried out this activity for a cab-service provider. A Python application generated cab-driver information in real time. I created a custom Kafka producer and used Structured Streaming to process the data and report driver availability, such as the total number of cab drivers and how many are on duty.



  • Joined KIA Motors India (Aug. 2021 - June 2022)


    • I joined KIA Motors India as a Data Engineer in the Data Science team, where I work with Big Data ecosystems and collaborate closely with data scientists.
    • Migrated large volumes of data from various data sources to the KIA India Hadoop cluster, hosted on a private cloud.
    • Created data ingestion pipelines to ingest data on a daily/monthly basis.

  • Joined IntraEdge Inc (June 2022 - Present)

  • In this organization, I joined as a Senior Big Data Developer, performing code development, automating data pipelines, and analyzing data for presentation in business intelligence tools.

  • Latest Trainings and Certifications


    • Associate Cloud Engineer certification from Google.
    • Certification course from NIIT in Object-Oriented Programming.
    • Certification course in Java from Shadow Infosystem.
    • Attended an Amazon day conference.
    • Attended training on GCP at Google.


    My Interest

    First of all, I love music; romantic music is my favorite. I enjoy watching technical videos, movies, and web series, and playing games with my buddies. I spend quite a lot of time traveling and doing photography, which keeps me fresh for work. When I'm free, I also spend time cooking and enjoying myself with friends.

    • Music
    • Car Driving
    • Photography
    • Football
    • Traveling
    • Movies

    From My Blog: Codearmyforce

    No posts.

    Get in Touch

    Name*


    Message*


    Feel Free to Contact

    If you have any questions, feel free to contact me anytime. Simply use the form to the left, or one of the methods below.

    • Noida,India.
    • +91-9354567799
    • erravicsekumar@gmail.com