Sagar Karira
Phone: +91-8826758744
karirasagar@gmail.com
Profile
Technical Expertise
Java Programming:
PROJECTS:
Abstract:
The basic aim of ELIPS is to track and maintain a complete record of parts and services.
XML files are validated against an XSD, and successfully parsed records are populated into
Hive tables as per the requirements. After the Hive tables are populated, the data is pushed
into a Netezza DB through Sqoop export. Once done, the data is moved from the STG schema
to BASE through Informatica. All of these actions are automated through an Oozie workflow
scheduled once a day.
Job Role:
Validating XML files against an XSD and dumping the processed records into HDFS
and Hive tables.
Creating Hive tables and populating them with data from an HDFS directory.
Exporting data from Hive to Netezza through Sqoop export.
Automating the process with an Oozie workflow.
Configuring email alerts for failed actions in the workflow.
Technologies Involved:
Java code for validating XML files against XSD.
XSLT Programming.
Sqoop export from HDFS to Netezza DB.
Shell Scripting
Oozie Workflow with email alerts.
Informatica workflow.
ESP scheduler.
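The XSD validation step above can be sketched with the standard javax.xml.validation API. This is a minimal illustration, not the production code: the class name is made up, and in-memory strings stand in for the XML files that were actually read from disk.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

public class XsdValidator {
    // Validates one XML document against an XSD schema. Records that fail
    // validation are rejected before the HDFS/Hive load.
    public static boolean isValid(String xml, String xsd) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;            // document conforms to the schema
        } catch (Exception e) {     // SAXException: invalid; IOException: unreadable
            return false;
        }
    }
}
```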
Abstract:
The basic aim of SA5 is to organize JSON data into a standard format that can be used for
analysis. JSON files are generated as users click on links. The files are then converted to a
standard JSON format using a JSON SerDe and a custom input format. Records are flattened,
and a Hive UDAF is written as per the requirements to fetch particular data from each file,
which is then loaded into a Hive table. Data is imported from DB2 to HDFS through Sqoop
import, joined with the UDAF output, and loaded into a Hive table.
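The record-flattening step can be sketched as below, assuming the JSON has already been parsed into nested maps (e.g. by the SerDe); the class and method names are illustrative, not the actual project code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JsonFlattener {
    // Flattens nested maps (parsed JSON) into dotted keys, e.g.
    // {"user": {"id": 7}} -> {"user.id": 7}, so each record fits a flat
    // Hive table schema.
    public static Map<String, Object> flatten(Map<String, Object> node) {
        Map<String, Object> out = new LinkedHashMap<>();
        flattenInto("", node, out);
        return out;
    }

    @SuppressWarnings("unchecked")
    private static void flattenInto(String prefix, Map<String, Object> node,
                                    Map<String, Object> out) {
        for (Map.Entry<String, Object> e : node.entrySet()) {
            String key = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                // Recurse into nested objects, extending the dotted prefix.
                flattenInto(key, (Map<String, Object>) e.getValue(), out);
            } else {
                out.put(key, e.getValue());
            }
        }
    }
}
```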
Job Role:
Abstract:
Combined Table is a complete process of combining data from various sources so as to
reduce the existing load on Netezza, while accounting for performance issues during migration.
Job Role:
Abstract:
Warranty is a process of migrating data from DB2 to Redshift via Informatica, while
accounting for performance issues when migrating over the cloud.
Abstract:
Previously, users had to go through many manual steps to reach the end result, i.e. finding
part-related details such as PIN, catalog, etc.
We automated that process and dump the data into Redshift; now the user only has to fire a
single query on the Redshift table to reach the expected end result. We used the ORC format
for the Hive tables, along with compression properties, as the data was huge and join queries
never finished. After processing, the data is dumped into an S3 bucket, and from there it is
loaded into the Redshift table through the COPY command.
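The final load step can be sketched as building the Redshift COPY statement for ORC data on S3. The helper below only assembles the SQL string; the table, bucket, and IAM role names are hypothetical placeholders, not values from the project.

```java
public class RedshiftCopy {
    // Builds a Redshift COPY statement that loads ORC files from an S3
    // prefix into the given table. All argument values are supplied by the
    // caller; none are real project identifiers.
    public static String copyStatement(String table, String s3Path, String iamRole) {
        return "COPY " + table
             + " FROM '" + s3Path + "'"
             + " IAM_ROLE '" + iamRole + "'"
             + " FORMAT AS ORC;";
    }
}
```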
Academic Qualification
Achievements
Interests
Music
Reading about cars
Personal Information
Sagar Karira