Why Collaborate with Cuelebre
- The customer was struggling to secure multilingual datasets from
different internal and external government agencies and to analyse
them in order to predict gaps in the job market and recommend
career paths.
- Cuelebre provided a team to support both the development of ETL
jobs and the setup of the customer's data lake platform in an
on-premises environment using Cloudera Data Platform.
Challenges/Hidden Anomalies
The team faced many challenges while setting up Cloudera Data Platform
and building the data lake pipelines to secure data from more than 40
sources across internal and external government agencies in order to
predict the job market. These included:
- Datasets in multiple languages had to be analysed because of immigration.
- Legacy datasets did not integrate out of the box with the data lake.
- For privacy reasons, many laws and regulations applied to the data and the platform, which required a re-installation.
- Datasets arrived in different types and at different frequencies from different source systems.
- The existing team processed many datasets manually.
Domain Handling Walkthrough
- Cuelebre suggested rebuilding the existing platform, migrating from
Hortonworks to Cloudera Data Platform, to resolve the existing
installation issues and to address the legal obligations.
- We also recommended and built a utility based on Spark and Delta Lake
to load and ingest the different data pipelines.
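As a rough illustration of what a Spark and Delta Lake ingestion utility of this kind might look like, the sketch below describes each source as a small configuration object and writes it to a Delta table. It is not the utility built for the customer: the names SourceConfig, target_path and ingest, the field names, and all paths are hypothetical, and the sketch assumes an existing SparkSession that already has the Delta Lake package configured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceConfig:
    """Hypothetical description of one source feed (illustrative fields only)."""
    name: str                    # logical source name, used as the table folder
    path: str                    # landing location of the raw files
    file_format: str             # Spark reader format, e.g. "csv", "json", "parquet"
    write_mode: str = "append"   # "append" or "overwrite"


def target_path(lake_root: str, cfg: SourceConfig) -> str:
    """Derive the Delta table location for a source under the lake root."""
    return f"{lake_root.rstrip('/')}/{cfg.name}"


def ingest(spark, cfg: SourceConfig, lake_root: str) -> str:
    """Read one source with Spark and write it out as a Delta table.

    `spark` is assumed to be a SparkSession with Delta Lake available.
    Returns the path the table was written to.
    """
    # Validate the configuration before touching Spark.
    if cfg.write_mode not in ("append", "overwrite"):
        raise ValueError(f"unsupported write mode: {cfg.write_mode}")

    df = spark.read.format(cfg.file_format).load(cfg.path)
    dest = target_path(lake_root, cfg)
    df.write.format("delta").mode(cfg.write_mode).save(dest)
    return dest
```

With this shape, onboarding another source would mostly mean adding one more SourceConfig entry rather than writing a new job, which is one plausible way to handle many sources with different formats and load frequencies.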
Optimized Solutions/Results/Client Satisfaction
- The platform was rebuilt according to the security guidelines
provided by Legal and IT Security.
- We reduced manual work by 90%: employees now use the data lake and
visualize data in reports rather than in Excel files.
- We helped secure the data points in the data lake needed to predict the unemployment rate in Sweden due to Covid-19.
- We built a series of pipelines that support decisions on whether a job seeker is close to the job market or needs support.