Type: | Package |
Title: | What Skills and Qualifications are Required for Data Science Related Jobs? |
Version: | 2.0.0 |
Maintainer: | Thiyanga S. Talagala <ttalagala@sjp.ac.lk> |
Description: | Dataset containing information about job listings for data science job roles. |
License: | CC BY 4.0 |
URL: | https://github.com/thiyangt/DSjobtracker |
Encoding: | UTF-8 |
LazyData: | true |
LazyDataCompression: | xz |
RoxygenNote: | 7.2.3 |
Depends: | R (≥ 3.5.0) |
Suggests: | knitr, rmarkdown, tibble, tidyr, ggplot2, dplyr, magrittr, testthat, wordcloud2, forcats, viridis |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2023-12-09 07:21:35 UTC; thiyangashaminitalagala |
Author: | Thiyanga S. Talagala |
Repository: | CRAN |
Date/Publication: | 2023-12-09 07:40:02 UTC |
Data Scientists/Data Analyst/ Statistician Job Tracker
Description
Job advertisements
Usage
DSraw
Format
A data frame with 551 rows and 152 variables
- ID
row id
- Consultant
Name of the consultant
- DateRetrieved
Date of Data Retrieved
- DatePublished
Published Date of the Advertisement
- Job_title
Name of the job category
- Company
Name of the Company
- R
If R is required -> 1 ,If not mentioned -> 0
- SAS
If SAS is required -> 1 , If not mentioned -> 0
- SPSS
If SPSS is required -> 1 , If not mentioned -> 0
- Python
If Python is required -> 1 , If not mentioned -> 0
- MAtlab
If Matlab is required -> 1 , If not mentioned -> 0
- Scala
If Scala is required -> 1 , If not mentioned -> 0
- C#
If C# is required -> 1 , If not mentioned -> 0
- MS Word
If knowledge in MS Word is required -> 1 , If not mentioned -> 0
- Ms Excel
If knowledge in MS Excel is required -> 1 , If not mentioned -> 0
- OLE/DB
If knowledge in OLE/DB is required -> 1 , If not mentioned -> 0
- Ms Access
If Ms Access is required -> 1 , If not mentioned -> 0
- Ms PowerPoint
If knowledge in Ms Powerpoint is required -> 1 , If not mentioned -> 0
- Spreadsheets
If knowledge in Spreadsheets is required -> 1 , If not mentioned -> 0
- Data_visualization
If knowledge in Data Visualization is required -> 1 , If not mentioned -> 0
- Presentation_Skills
If Presentation Skills are required -> 1 , If not mentioned -> 0
- Communication
If Communication skills are required -> 1 , If not mentioned -> 0
- BigData
If knowledge in Big Data analysis is required -> 1 , If not mentioned -> 0
- Data_warehouse
If knowledge in Data Warehouse is required -> 1 , If not mentioned -> 0
- cloud_storage
If knowledge in Cloud Storage is required -> 1 , If not mentioned -> 0
- Google_Cloud
If knowledge in Google Cloud is required -> 1 , If not mentioned -> 0
- AWS
If knowledge in AWS is required -> 1 , If not mentioned -> 0
- Machine_Learning
If knowledge in Machine Learning is required -> 1 , If not mentioned -> 0
- Deep Learning
If knowledge in Deep Learning is required -> 1 , If not mentioned -> 0
- Computer_vision
If knowledge in Computer Vision is required -> 1 , If not mentioned -> 0
- Java
If Java is required -> 1 , If not mentioned -> 0
- C++
If C++ is required -> 1 , If not mentioned -> 0
- C
If C is required -> 1 , If not mentioned -> 0
- Linux/Unix
If knowledge in Linux/Unix is required -> 1 , If not mentioned -> 0
- SQL
If SQL is required -> 1 , If not mentioned -> 0
- NoSQL
If NoSQL is required -> 1 , If not mentioned -> 0
- RDBMS
If knowledge in RDBMS is required -> 1 , If not mentioned -> 0
- Oracle
If knowledge in Oracle is required -> 1 , If not mentioned -> 0
- MySQL
If MYSQL is required -> 1 , If not mentioned -> 0
- PHP
If PHP is required -> 1 , If not mentioned -> 0
- Flash_Actionscript
If knowledge in Flash Action Script is required -> 1 , If not mentioned -> 0
- SPL
If knowledge in SPL is required -> 1 , If not mentioned -> 0
- web_design_and_development_tools
If knowledge in Web Design and Development Tools is required -> 1 , If not mentioned -> 0
- Wordpress
If knowledge in Wordpress is required -> 1 , If not mentioned -> 0
- AI
If Artificial Intelligence is required -> 1 , If not mentioned -> 0
- Natural_Language_Processing(NLP)
If knowledge in NLP is required -> 1 , If not mentioned -> 0
- Microsoft Power BI
If knowledge in Microsoft Power BI is required -> 1 , If not mentioned -> 0
- Google_Analytics
If knowledge in Google Analytics is required -> 1 , If not mentioned -> 0
- graphics_and_design_skills
If Graphic and Design Skills are required -> 1 , If not mentioned -> 0
- Data_marketing
If Data Marketing ability is required -> 1 , If not mentioned -> 0
- SEO
If knowledge in SEO is required -> 1 , If not mentioned -> 0
- Content_Management
If knowledge in Content Management is required -> 1 , If not mentioned -> 0
- Tableau
If knowledge in Tableau is required -> 1 , If not mentioned -> 0
- D3
If knowledge in D3 is required -> 1 , If not mentioned -> 0
- Alteryx
If knowledge in Alteryx is required -> 1 , If not mentioned -> 0
- KNIME
If knowledge in KNIME is required -> 1 , If not mentioned -> 0
- Spotfire
If knowledge in Spotfire is required -> 1 , If not mentioned -> 0
- Spark
If knowledge in Spark is required -> 1 , If not mentioned -> 0
- S3
If knowledge in S3 is required -> 1 , If not mentioned -> 0
- Redshift
If knowledge in Redshift is required -> 1 , If not mentioned -> 0
- DigitalOcean
If knowledge in Digital Ocean is required -> 1 , If not mentioned -> 0
- Javascript
If JavaScript is required -> 1 , If not mentioned -> 0
- Kafka
If knowledge in Kafka is required -> 1 , If not mentioned -> 0
- Storm
If knowledge in Storm is required -> 1 , If not mentioned -> 0
- Bash
If knowledge in Bash is required -> 1 , If not mentioned -> 0
- Hadoop
If knowledge in Hadoop is required -> 1 , If not mentioned -> 0
- Data_Pipelines
If knowledge in Data Pipelines is required -> 1 , If not mentioned -> 0
- MPP_Platforms
If MPP Platforms is required -> 1 , If not mentioned -> 0
- Qlik
If Qlik is required ->1,If not mentioned ->0
- Pig
If Pig is required ->1,If not mentioned ->0
- Hive
If Hive is required ->1,If not mentioned ->0
- Tensorflow
If Tensorflow is required ->1,If not mentioned ->0
- Map/Reduce
If Map/Reduce is required ->1,If not mentioned ->0
- Impala
If Impala is required ->1,If not mentioned ->0
- Solr
If Solr is required -> 1 , If not mentioned -> 0
- Teradata
If Teradata is required ->1,If not mentioned ->0
- MongoDB
If MongoDB is required -> 1 , If not mentioned -> 0
- Elasticsearch
If Elasticsearch is required ->1,If not mentioned ->0
- YOLO
If YOLO is required -> 1 , If not mentioned -> 0
- agile execution
If agile execution is required->1 ,If not mentioned->0
- Data_management
If the knowledge in data management is required->1 ,If not mentioned->0
- pyspark
If pyspark is required->1 ,If not mentioned->0
- Data_mining
If the knowledge in data mining is required->1 ,If not mentioned->0
- Data_science
If the knowledge in data science is required->1 ,If not mentioned->0
- Web_Analytic_tools
If the knowledge in Web Analytic tools is required->1 ,If not mentioned->0
- IOT
If IOT is required->1 ,If not mentioned->0
- Numerical_Analysis
If the knowledge in Numerical Analysis is required->1 ,If not mentioned->0
- Economic
If the knowledge in Economic is required->1 ,If not mentioned->0
- Finance_Knowledge
If Finance_Knowledge is required->1 ,If not mentioned->0
- Investment_Knowledge
If Investment Knowledge is required->1 ,If not mentioned->0
- Problem_Solving
If the ability of Problem Solving is required->1 ,If not mentioned->0
- Korean_language
If the ability of speaking Korean language is required->1 ,If not mentioned->0
- Bash\Linux Scripting
If Bash\ Linux Scripting is required->1 ,If not mentioned->0
- Knowledge_in
Required knowledge to do a particular job ,If not mentioned->NA
- Experience
Minimum experience required for a particular job
- City
City where the company is located in
- Location
Country where the company is located in
- Educational_qualifications
Required educational qualifications
- Salary
Amount of salary
- Team_Handling
If the ability of Team Handling is required -> 1 , If not mentioned -> 0
- Debtor_reconcilation
If the ability of Debtor reconciliation is required -> 1 , If not mentioned -> 0
- Payroll_management
If the ability of Payroll management is required -> 1 , If not mentioned -> 0
- Bayesian
If Bayesian knowledge is required -> 1 , If not mentioned -> 0
- Optimization
If Optimization knowledge is required -> 1 , If not mentioned -> 0
- Bahasa Malaysia
If Bahasa Malaysia is required -> 1 , If not mentioned -> 0
- English proficiency
If English proficiency is required -> 1 , If not mentioned -> 0
- URL
Web address of a particular job advertisement
- Search_Term
web search term of a particular job advertisement
- X109
Columns with null values
- X110
Columns with null values
- X111
Columns with null values
- X112
Columns with null values
- X113
Columns with null values
- X114
Columns with null values
- X115
Columns with null values
- X116
Columns with null values
- X117
Columns with null values
- X118
Columns with null values
- X119
Columns with null values
- X120
Columns with null values
- X121
Columns with null values
- X122
Columns with null values
- X123
Columns with null values
- X124
Columns with null values
- X125
Columns with null values
- X126
Columns with null values
- X127
Columns with null values
- X128
Columns with null values
- X129
Columns with null values
- X130
Columns with null values
- X131
Columns with null values
- X132
Columns with null values
- X133
Columns with null values
- X134
Columns with null values
- X135
Columns with null values
- X136
Columns with null values
- X137
Columns with null values
- X138
Columns with null values
- X139
Columns with null values
- X140
Columns with null values
- X141
Columns with null values
- X142
Columns with null values
- X143
Columns with null values
- X144
Columns with null values
- X145
Columns with null values
- X146
Columns with null values
- X147
Columns with null values
- X148
Columns with null values
- X149
Columns with null values
- X150
Columns with null values
- X151
Columns with null values
- X152
Columns with null values
Source
Collected and entered by BSc (Hons) Statistics undergraduates - 2020
Examples
data(DSraw)
head(DSraw)
summary(DSraw)
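The X109 to X152 columns are documented above as holding only null values. A minimal sketch of dropping such columns before analysis follows, shown on a small made-up data frame so it runs without the package. The helper name `drop_empty_columns` is hypothetical (it is not part of DSjobtracker), and the sketch assumes the null values are stored as `NA`.

```r
# Hypothetical helper (not part of DSjobtracker): drop columns that are entirely NA.
drop_empty_columns <- function(df) {
  keep <- vapply(df, function(col) !all(is.na(col)), logical(1))
  df[, keep, drop = FALSE]
}

# Toy stand-in for DSraw with one all-NA placeholder column (values are made up).
toy <- data.frame(ID = 1:3, R = c(1, 0, 1), X109 = c(NA, NA, NA))
cleaned <- drop_empty_columns(toy)
names(cleaned)
```

With the package loaded, the same helper could presumably be applied as `drop_empty_columns(DSraw)`.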
Data scientists, data analyst, and statistician job advertisements from 2020 to 2023
Description
A dataset with 1172 rows and 109 variables
Usage
data(DStidy)
Details
ID. row id
Consultant. Name of the consultant
DateRetrieved. Date of Data Retrieved
DatePublished. Published Date of the Advertisement
Job_title. Name of the job category
Company. Name of the Company
R. If R is required -> 1 ,If not mentioned -> 0
SAS. If SAS is required -> 1 , If not mentioned -> 0
SPSS. If SPSS is required -> 1 , If not mentioned -> 0
Python. If Python is required -> 1 , If not mentioned -> 0
MAtlab. If Matlab is required -> 1 , If not mentioned -> 0
Scala. If Scala is required -> 1 , If not mentioned -> 0
C#. If C# is required -> 1 , If not mentioned -> 0
MS Word. If knowledge in MS Word is required -> 1 , If not mentioned -> 0
Ms Excel. If knowledge in MS Excel is required -> 1 , If not mentioned -> 0
OLE/DB. If knowledge in OLE/DB is required -> 1 , If not mentioned -> 0
Ms Access. If Ms Access is required -> 1 , If not mentioned -> 0
Ms PowerPoint. If knowledge in Ms Powerpoint is required -> 1 , If not mentioned -> 0
Spreadsheets. If knowledge in Spreadsheets is required -> 1 , If not mentioned -> 0
Data_visualization. If knowledge in Data Visualization is required -> 1 , If not mentioned -> 0
Presentation_Skills. If Presentation Skills are required -> 1 , If not mentioned -> 0
Communication. If Communication skills are required -> 1 , If not mentioned -> 0
BigData. If knowledge in Big Data analysis is required -> 1 , If not mentioned -> 0
Data_warehouse. If knowledge in Data Warehouse is required -> 1 , If not mentioned -> 0
cloud_storage. If knowledge in Cloud Storage is required -> 1 , If not mentioned -> 0
Google_Cloud. If knowledge in Google Cloud is required -> 1 , If not mentioned -> 0
AWS. If knowledge in AWS is required -> 1 , If not mentioned -> 0
Machine_Learning. If knowledge in Machine Learning is required -> 1 , If not mentioned -> 0
Deep Learning. If knowledge in Deep Learning is required -> 1 , If not mentioned -> 0
Computer_vision. If knowledge in Computer Vision is required -> 1 , If not mentioned -> 0
Java. If Java is required -> 1 , If not mentioned -> 0
C++. If C++ is required -> 1 , If not mentioned -> 0
C. If C is required -> 1 , If not mentioned -> 0
Linux/Unix. If knowledge in Linux/Unix is required -> 1 , If not mentioned -> 0
SQL. If SQL is required -> 1 , If not mentioned -> 0
NoSQL. If NoSQL is required -> 1 , If not mentioned -> 0
RDBMS. If knowledge in RDBMS is required -> 1 , If not mentioned -> 0
Oracle. If knowledge in Oracle is required -> 1 , If not mentioned -> 0
MySQL. If MYSQL is required -> 1 , If not mentioned -> 0
PHP. If PHP is required -> 1 , If not mentioned -> 0
Flash_Actionscript. If knowledge in Flash Action Script is required -> 1 , If not mentioned -> 0
SPL. If knowledge in SPL is required -> 1 , If not mentioned -> 0
web_design_and_development_tools. If knowledge in Web Design and Development Tools is required -> 1 , If not mentioned -> 0
Wordpress. If knowledge in Wordpress is required -> 1 , If not mentioned -> 0
AI. If Artificial Intelligence is required -> 1 , If not mentioned -> 0
Natural_Language_Processing(NLP). If knowledge in NLP is required -> 1 , If not mentioned -> 0
Microsoft Power BI. If knowledge in Microsoft Power BI is required -> 1 , If not mentioned -> 0
Google_Analytics. If knowledge in Google Analytics is required -> 1 , If not mentioned -> 0
graphics_and_design_skills. If Graphic and Design Skills are required -> 1 , If not mentioned -> 0
Data_marketing. If Data Marketing ability is required -> 1 , If not mentioned -> 0
SEO. If knowledge in SEO is required -> 1 , If not mentioned -> 0
Content_Management. If knowledge in Content Management is required -> 1 , If not mentioned -> 0
Tableau. If knowledge in Tableau is required -> 1 , If not mentioned -> 0
D3. If knowledge in D3 is required -> 1 , If not mentioned -> 0
Alteryx. If knowledge in Alteryx is required -> 1 , If not mentioned -> 0
KNIME. If knowledge in KNIME is required -> 1 , If not mentioned -> 0
Spotfire. If knowledge in Spotfire is required -> 1 , If not mentioned -> 0
Spark. If knowledge in Spark is required -> 1 , If not mentioned -> 0
S3. If knowledge in S3 is required -> 1 , If not mentioned -> 0
Redshift. If knowledge in Redshift is required -> 1 , If not mentioned -> 0
DigitalOcean. If knowledge in Digital Ocean is required -> 1 , If not mentioned -> 0
Javascript. If JavaScript is required -> 1 , If not mentioned -> 0
Kafka. If knowledge in Kafka is required -> 1 , If not mentioned -> 0
Storm. If knowledge in Storm is required -> 1 , If not mentioned -> 0
Bash. If knowledge in Bash is required -> 1 , If not mentioned -> 0
Hadoop. If knowledge in Hadoop is required -> 1 , If not mentioned -> 0
Data_Pipelines. If knowledge in Data Pipelines is required -> 1 , If not mentioned -> 0
MPP_Platforms. If MPP Platforms is required -> 1 , If not mentioned -> 0
Qlik. If Qlik is required ->1,If not mentioned ->0
Pig. If Pig is required ->1,If not mentioned ->0
Hive. If Hive is required ->1,If not mentioned ->0
Tensorflow. If Tensorflow is required ->1,If not mentioned ->0
Map/Reduce. If Map/Reduce is required ->1,If not mentioned ->0
Impala. If Impala is required ->1,If not mentioned ->0
Solr. If Solr is required -> 1 , If not mentioned -> 0
Teradata. If Teradata is required ->1,If not mentioned ->0
MongoDB. If MongoDB is required -> 1 , If not mentioned -> 0
Elasticsearch. If Elasticsearch is required ->1,If not mentioned ->0
YOLO. If YOLO is required -> 1 , If not mentioned -> 0
agile execution. If agile execution is required->1 ,If not mentioned->0
Data_management. If the knowledge in data management is required->1 ,If not mentioned->0
pyspark. If pyspark is required->1 ,If not mentioned->0
Data_mining. If the knowledge in data mining is required->1 ,If not mentioned->0
Data_science. If the knowledge in data science is required->1 ,If not mentioned->0
Web_Analytic_tools. If the knowledge in Web Analytic tools is required->1 ,If not mentioned->0
IOT. If IOT is required->1 ,If not mentioned->0
Numerical_Analysis. If the knowledge in Numerical Analysis is required->1 ,If not mentioned->0
Economic. If the knowledge in Economic is required->1 ,If not mentioned->0
Finance_Knowledge. If Finance_Knowledge is required->1 ,If not mentioned->0
Investment_Knowledge. If Investment Knowledge is required->1 ,If not mentioned->0
Problem_Solving. If the ability of Problem Solving is required->1 ,If not mentioned->0
Team_Handling. If the ability of Team Handling is required->1 ,If not mentioned->0
Debtor_reconcilation. If the ability of Debtor reconciliation is required -> 1 , If not mentioned -> 0
Payroll_management. If Payroll management is required->1 ,If not mentioned->0
Bayesian. If Bayesian is required->1 ,If not mentioned->0
Optimization. If Optimization knowledge is required -> 1 , If not mentioned -> 0
Knowledge_in. Required knowledge to do a particular job ,If not mentioned->NA
City. City where the company is located in
Educational_qualifications. Required educational qualifications
Salary. Amount of salary
URL. Web address of a particular job advertisement
Search_Term. web search term of a particular job advertisement
Job_Category. Category of the job (i.e. "Data Science","Data Analyst" etc.)
Bahasa_Malaysia. If Bahasa Malaysia is required -> 1 , If not mentioned -> 0
English_proficiency. If English proficiency is required -> 1 , If not mentioned -> 0
Experience_Category. Number of years of experience, binned into categories
Location. Location
Payment Frequency. Payment frequency
BSc_needed. If BSc is required -> 1 , If not mentioned -> 0
MSc_needed. If MSc is required -> 1 , If not mentioned -> 0
PhD_needed. If PhD is required -> 1 , If not mentioned -> 0
English Needed. If English is required -> 1 , If not mentioned -> 0
year. Survey year
Source
Data were collected through the BSc (Hons) Statistics programme, University of Sri Jayewardenepura, under the statistical consultancy service from 2020 to 2023.
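Because most columns are 0/1 indicators, the share of advertisements that mention a skill is simply the column mean. The sketch below illustrates this on a made-up data frame that mimics three of the indicator columns. It assumes the indicators are stored as numeric 0/1 values, which the documentation implies but does not state.

```r
# Toy stand-in for a few DStidy indicator columns (values are made up).
toy <- data.frame(
  R      = c(1, 0, 1, 1),
  Python = c(1, 1, 1, 0),
  SQL    = c(0, 0, 1, 0)
)

# Share of advertisements that mention each skill, highest first.
skill_share <- sort(colMeans(toy, na.rm = TRUE), decreasing = TRUE)
skill_share
```

On the real data the same call would presumably be restricted to the indicator columns, for example `colMeans(DStidy[, c("R", "Python", "SQL")], na.rm = TRUE)`.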
Data scientists, data analyst, and statistician related job advertisements in 2020
Description
A dataset with 430 rows and 115 columns
Usage
data(DStidy_2020)
Details
ID. Row id
Consultant. Name of the consultant
DateRetrieved. Date of data retrieved
DatePublished. Published date of the advertisement
Job_title. Name of the job category
Company. Name of the company
R. If R is required -> 1 , If not mentioned -> 0
SAS. If SAS is required -> 1 , If not mentioned -> 0
SPSS. If SPSS is required -> 1 , If not mentioned -> 0
Python. If Python is required -> 1 , If not mentioned -> 0
MAtlab. If MAtlab is required -> 1 , If not mentioned -> 0
Scala. If Scala is required -> 1 , If not mentioned -> 0
C_Sharp. If C_Sharp is required -> 1 , If not mentioned -> 0
Ms_Excel. If Ms_Excel is required -> 1 , If not mentioned -> 0
OLE_DB. If OLE_DB is required -> 1 , If not mentioned -> 0
Ms_Access. If Ms_Access is required -> 1 , If not mentioned -> 0
Ms_PowerPoint. If Ms_PowerPoint is required -> 1 , If not mentioned -> 0
Spreadsheets. If Spreadsheets is required -> 1 , If not mentioned -> 0
Data_visualization. If knowledge in Data Visualization is required -> 1 , If not mentioned -> 0
Presentation_Skills. If Presentation Skills are required -> 1 , If not mentioned -> 0
Communication. If Communication skills are required -> 1 , If not mentioned -> 0
BigData. If knowledge in Big Data analysis is required -> 1 , If not mentioned -> 0
Data_warehouse. If knowledge in Data Warehouse is required -> 1 , If not mentioned -> 0
cloud_storage. If knowledge in Cloud Storage is required -> 1 , If not mentioned -> 0
Google_Cloud. If knowledge in Google Cloud is required -> 1 , If not mentioned -> 0
AWS. If knowledge in AWS is required -> 1 , If not mentioned -> 0
Machine_Learning. If knowledge in Machine Learning is required -> 1 , If not mentioned -> 0
Deep_Learning. If knowledge in Deep Learning is required -> 1 , If not mentioned -> 0
Computer_vision. If knowledge in Computer Vision is required -> 1 , If not mentioned -> 0
Java. If Java is required -> 1 , If not mentioned -> 0
Cpp. If Cpp is required -> 1 , If not mentioned -> 0
C. If C is required -> 1 , If not mentioned -> 0
Linux_Unix. If knowledge in Linux/Unix is required -> 1 , If not mentioned -> 0
SQL. If SQL is required -> 1 , If not mentioned -> 0
NoSQL. If NoSQL is required -> 1 , If not mentioned -> 0
RDBMS. If knowledge in RDBMS is required -> 1 , If not mentioned -> 0
Oracle. If knowledge in Oracle is required -> 1 , If not mentioned -> 0
MySQL. If MYSQL is required -> 1 , If not mentioned -> 0
PHP. If PHP is required -> 1 , If not mentioned -> 0
Flash_Actionscript. If Flash_Actionscript is required -> 1 , If not mentioned -> 0
SPL. If knowledge in SPL is required -> 1 , If not mentioned -> 0
web_design_and_development_tools. If knowledge in Web Design and Development Tools is required -> 1 , If not mentioned -> 0
Wordpress. If Wordpress is required -> 1 , If not mentioned -> 0
AI. If AI is required -> 1 , If not mentioned -> 0
Natural_Language_Processing(NLP). If knowledge in NLP is required -> 1 , If not mentioned -> 0
Microsoft_Power_BI. If knowledge in Microsoft Power BI is required -> 1 , If not mentioned -> 0
Google_Analytics. If knowledge in Google Analytics is required -> 1 , If not mentioned -> 0
graphics_and_design_skills. If Graphic and Design Skills are required -> 1 , If not mentioned -> 0
Data_marketing. If Data Marketing ability is required -> 1 , If not mentioned -> 0
SEO. If knowledge in SEO is required -> 1 , If not mentioned -> 0
Content_Management. If knowledge in Content Management is required -> 1 , If not mentioned -> 0
Tableau. If knowledge in Tableau is required -> 1 , If not mentioned -> 0
D3. If knowledge in D3 is required -> 1 , If not mentioned -> 0
Alteryx. If knowledge in Alteryx is required -> 1 , If not mentioned -> 0
KNIME. If knowledge in KNIME is required -> 1 , If not mentioned -> 0
Spotfire. If knowledge in Spotfire is required -> 1 , If not mentioned -> 0
Spark. If knowledge in Spark is required -> 1 , If not mentioned -> 0
S3. If knowledge in S3 is required -> 1 , If not mentioned -> 0
Redshift. If knowledge in Redshift is required -> 1 , If not mentioned -> 0
DigitalOcean. If knowledge in Digital Ocean is required -> 1 , If not mentioned -> 0
Javascript. If JavaScript is required -> 1 , If not mentioned -> 0
Kafka. If knowledge in Kafka is required -> 1 , If not mentioned -> 0
Storm. If knowledge in Storm is required -> 1 , If not mentioned -> 0
Bash. If knowledge in Bash is required -> 1 , If not mentioned -> 0
Hadoop. If knowledge in Hadoop is required -> 1 , If not mentioned -> 0
Data_Pipelines. If knowledge in Data Pipelines is required -> 1 , If not mentioned -> 0
MPP_Platforms. If MPP Platforms is required -> 1 , If not mentioned -> 0
Qlik. If Qlik is required -> 1 , If not mentioned -> 0
Pig. If Pig is required -> 1 , If not mentioned -> 0
Hive. If Hive is required -> 1 , If not mentioned -> 0
Tensorflow. If Tensorflow is required -> 1 , If not mentioned -> 0
Map_Reduce. If Map/Reduce is required -> 1 , If not mentioned -> 0
Impala. If Impala is required -> 1 ,If not mentioned -> 0
Solr. If Solr is required -> 1 , If not mentioned -> 0
Teradata. If Teradata is required -> 1 , If not mentioned -> 0
MongoDB. If MongoDB is required -> 1 , If not mentioned -> 0
Elasticsearch. If Elasticsearch is required -> 1, If not mentioned -> 0
YOLO. If YOLO is required -> 1, If not mentioned -> 0
agile_execution. If agile execution is required -> 1 , If not mentioned -> 0
Data_management. If the knowledge in Data Management is required -> 1 , If not mentioned -> 0
pyspark. If pyspark is required -> 1 , If not mentioned -> 0
Data_mining. If the knowledge in Data Mining is required -> 1 , If not mentioned -> 0
Data_science. If the knowledge in Data Science is required -> 1 , If not mentioned -> 0
Web_Analytic_tools. If the knowledge in Web Analytic tools is required -> 1 , If not mentioned -> 0
IOT. If IOT is required -> 1 , If not mentioned -> 0
Numerical_Analysis. If the knowledge in Numerical Analysis is required -> 1 , If not mentioned -> 0
Economic. If the knowledge in Economic is required -> 1 , If not mentioned -> 0
Finance_Knowledge. If Finance_Knowledge is required -> 1 , If not mentioned -> 0
Investment_Knowledge. If Investment Knowledge is required -> 1 , If not mentioned -> 0
Problem_Solving. If the ability of Problem Solving is required -> 1 , If not mentioned -> 0
Korean_language. If the ability of Korean language is required -> 1 , If not mentioned -> 0
Bash_Linux_Scripting. If Bash Linux Scripting is required -> 1 , If not mentioned -> 0
Team_Handling. If the ability of Team Handling is required -> 1 , If not mentioned -> 0
Debtor_reconcilation. If the ability of Debtor reconciliation is required -> 1 , If not mentioned -> 0
Payroll_management. If the ability of Payroll management is required -> 1 , If not mentioned -> 0
Bayesian. If Bayesian knowledge is required -> 1 , If not mentioned -> 0
Optimization. If Optimization knowledge is required -> 1 ,If not mentioned -> 0
Bahasa_Malaysia. If Bahasa_Malaysia knowledge is required -> 1 ,If not mentioned -> 0
Knowledge_in. Required knowledge to do a particular job , If not mentioned -> NA
City. City where the company is located in , If not mentioned -> NA
Location. Country where the company is located in
Educational_qualifications. Required educational qualifications
Salary. Salary
English_proficiency. English proficiency
URL. URL of the job advertisement
Search_Term. Search Term
Job_Category. Name of the job category
Minimum_Years_of_experience. Minimum years of experience needed for the job , If not mentioned -> NA
Experience. Experience
Experience_Category. Experience category
Job_Country. Job country
Edu_Category. Education category
Minimum_Salary. Minimum salary
Salary_Basis. Salary basis
Source
The data wrangling was done by Janith C. Wanniarachchie, BSc (Hons) Statistics, University of Sri Jayewardenepura, and the description file was prepared by Randi Shashikala.
Data Scientists/Data Analyst/ Statistician Job Advertisements in the year 2021
Description
Job advertisements collected in the year 2021
Usage
DStidy_2021
Format
A data frame with 382 rows and 115 columns
- ID
Row id
- Consultant
Name of the consultant
- URL
Web address of a particular job advertisement
- Search_Term
Web search term of a particular job advertisement
- DateRetrieved
Date of data retrieved
- DatePublished
Published date of the advertisement
- Job_Field
Name of the related job field
- Job_title
Name of the job category
- Company
Name of the company
- Knowledge_in
Required knowledge to do a particular job , If not mentioned -> NA
- Minimum Experience in Years
Minimum years of experience needed for the job , If not mentioned -> NA
- City
City where the company is located in , If not mentioned -> NA
- Location
Country where the company is located in
- Educational_qualifications
Required educational qualifications
- Payment Frequency
Payment basis of salary(i.e. "hourly","daily","monthly","yearly", "NA")
- Currency
Currency type of the salary
- Salary
Amount of salary
- English Needed
If English proficiency is required -> 1 , If not mentioned -> 0
- English proficiency description
Required level of English proficiency , If not mentioned -> NA
- Additional_languages
If languages other than English are required -> 1 , If not mentioned -> NA
- AI
If Artificial Intelligence is required -> 1 , If not mentioned -> 0
- Natural_Language_Processing(NLP)
If knowledge in NLP is required -> 1 , If not mentioned -> 0
- Data_Pipelines
If knowledge in Data Pipelines is required -> 1 , If not mentioned -> 0
- Machine_Learning
If knowledge in Machine Learning is required -> 1 , If not mentioned -> 0
- Deep Learning
If knowledge in Deep Learning is required -> 1 , If not mentioned -> 0
- Computer_vision
If knowledge in Computer Vision is required -> 1 , If not mentioned -> 0
- Data_visualization
If knowledge in Data Visualization is required -> 1 , If not mentioned -> 0
- Data_warehouse
If knowledge in Data Warehouse is required -> 1 , If not mentioned -> 0
- BigData
If knowledge in Big Data analysis is required -> 1 , If not mentioned -> 0
- Data_management
If the knowledge in Data Management is required -> 1 , If not mentioned -> 0
- Data_mining
If the knowledge in Data Mining is required -> 1 , If not mentioned -> 0
- Data_science
If the knowledge in Data Science is required -> 1 , If not mentioned -> 0
- Bayesian
If Bayesian knowledge is required -> 1 , If not mentioned -> 0
- Optimization
If Optimization knowledge is required -> 1 ,If not mentioned -> 0
- Numerical_Analysis
If the knowledge in Numerical Analysis is required -> 1 , If not mentioned -> 0
- IOT
If IOT is required -> 1 , If not mentioned -> 0
- Data_translation
If the knowledge in Data Translation is required -> 1 , If not mentioned -> 0
- R
If R is required -> 1 ,If not mentioned -> 0
- SAS
If SAS is required -> 1 , If not mentioned -> 0
- SPSS
If SPSS is required -> 1 , If not mentioned -> 0
- Python
If Python is required -> 1 , If not mentioned -> 0
- MAtlab
If Matlab is required -> 1 , If not mentioned -> 0
- Scala
If Scala is required -> 1 , If not mentioned -> 0
- C#
If C# is required -> 1 , If not mentioned -> 0
- Java
If Java is required -> 1 , If not mentioned -> 0
- C++
If C++ is required -> 1 , If not mentioned -> 0
- C
If C is required -> 1 , If not mentioned -> 0
- Bash
If Bash is required -> 1 , If not mentioned -> 0
- Tensorflow
If Tensorflow is required -> 1 , If not mentioned -> 0
- pyspark
If pyspark is required -> 1 , If not mentioned -> 0
- YOLO
If YOLO is required -> 1 , If not mentioned -> 0
- MS Word
If knowledge in MS Word is required -> 1 , If not mentioned -> 0
- Ms Excel
If knowledge in MS Excel is required -> 1 , If not mentioned -> 0
- Ms Access
If Ms Access is required -> 1 , If not mentioned -> 0
- Ms PowerPoint
If knowledge in Ms Powerpoint is required -> 1 , If not mentioned -> 0
- Spreadsheets
If knowledge in Spreadsheets is required -> 1 , If not mentioned -> 0
- Google_Analytics
If knowledge in Google Analytics is required -> 1 , If not mentioned -> 0
- Microsoft Power BI
If knowledge in Microsoft Power BI is required -> 1 , If not mentioned -> 0
- Tableau
If knowledge in Tableau is required -> 1 , If not mentioned -> 0
- D3
If knowledge in D3 is required -> 1 , If not mentioned -> 0
- Qlik
If Qlik is required -> 1 , If not mentioned -> 0
- KNIME
If knowledge in KNIME is required -> 1 , If not mentioned -> 0
- Spotfire
If knowledge in Spotfire is required -> 1 , If not mentioned -> 0
- Linux/Unix
If knowledge in Linux/Unix is required -> 1 , If not mentioned -> 0
- OLE/DB
If knowledge in OLE/DB is required -> 1 , If not mentioned -> 0
- SQL
If SQL is required -> 1 , If not mentioned -> 0
- NoSQL
If NoSQL is required -> 1 , If not mentioned -> 0
- RDBMS
If knowledge in RDBMS is required -> 1 , If not mentioned -> 0
- Oracle
If knowledge in Oracle is required -> 1 , If not mentioned -> 0
- MySQL
If MYSQL is required -> 1 , If not mentioned -> 0
- MongoDB
If MongoDB is required -> 1 , If not mentioned -> 0
- MPP_Platforms
If MPP Platforms is required -> 1 , If not mentioned -> 0
- SPL
If knowledge in SPL is required -> 1 , If not mentioned -> 0
- Alteryx
If knowledge in Alteryx is required -> 1 , If not mentioned -> 0
- Spark
If knowledge in Spark is required -> 1 , If not mentioned -> 0
- Kafka
If knowledge in Kafka is required -> 1 , If not mentioned -> 0
- Hadoop
If knowledge in Hadoop is required -> 1 , If not mentioned -> 0
- Pig
If Pig is required -> 1 , If not mentioned -> 0
- Hive
If Hive is required -> 1 , If not mentioned -> 0
- Map/Reduce
If Map/Reduce is required -> 1 , If not mentioned -> 0
- Impala
If Impala is required -> 1 ,If not mentioned -> 0
- Storm
If knowledge in Storm is required -> 1 , If not mentioned -> 0
- Google_Cloud
If knowledge in Google Cloud is required -> 1 , If not mentioned -> 0
- AWS
If knowledge in AWS is required -> 1 , If not mentioned -> 0
- cloud_storage
If knowledge in Cloud Storage is required -> 1 , If not mentioned -> 0
- S3
If knowledge in S3 is required -> 1 , If not mentioned -> 0
- Redshift
If knowledge in Redshift is required -> 1 , If not mentioned -> 0
- DigitalOcean
If knowledge in Digital Ocean is required -> 1 , If not mentioned -> 0
- Teradata
If Teradata is required -> 1 , If not mentioned -> 0
- Solr
If Solr is required -> 1 , If not mentioned -> 0
- Elasticsearch
If Elasticsearch is required -> 1 , If not mentioned -> 0
- Presentation_Skills
If Presentation Skills are required -> 1 , If not mentioned -> 0
- Communication
If Communication skills are required -> 1 , If not mentioned -> 0
- Problem_Solving
If the ability of Problem Solving is required -> 1 , If not mentioned -> 0
- Team_Handling
If the ability of Team Handling is required -> 1 , If not mentioned -> 0
- agile execution
If agile execution is required -> 1 , If not mentioned -> 0
- Data_marketing
If Data Marketing ability is required -> 1 , If not mentioned -> 0
- SEO
If knowledge in SEO is required -> 1 , If not mentioned -> 0
- graphics_and_design_skills
If Graphic and Design Skills are required -> 1 , If not mentioned -> 0
- Content_Management
If knowledge in Content Management is required -> 1 , If not mentioned -> 0
- Economic
If the knowledge in Economic is required -> 1 , If not mentioned -> 0
- Finance_Knowledge
If Finance_Knowledge is required -> 1 , If not mentioned -> 0
- Investment_Knowledge
If Investment Knowledge is required -> 1 , If not mentioned -> 0
- Debtor_reconcilation
If the ability of Debtor reconciliation is required -> 1 , If not mentioned -> 0
- Payroll_management
If the ability of Payroll management is required -> 1 , If not mentioned -> 0
- web_design_and_development_tools
If knowledge in Web Design and Development Tools is required -> 1 , If not mentioned -> 0
- PHP
If PHP is required -> 1 , If not mentioned -> 0
- Javascript
If JavaScript is required -> 1 , If not mentioned -> 0
- Web_Analytic_tools
If the knowledge in Web Analytic tools is required -> 1 , If not mentioned -> 0
- BSc_needed
If a BSc Degree is required -> Yes , If not mentioned -> No/NA
- MSc_needed
If an MSc Degree is required -> Yes , If not mentioned -> No/NA
- PhD_needed
If a PhD Degree is required -> Yes , If not mentioned -> No/NA
- Country
Country
- country_code
country code
- Job_Category
Job category
Source
The data wrangling was done by Janith C. Wanniarachchie, BSc (Hons) Statistics, University of Sri Jayewardenepura, and the description file was prepared by Randi Shashikala.
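In this dataset the degree columns (BSc_needed, MSc_needed, PhD_needed) are documented as Yes / No/NA rather than 1 / 0. A minimal sketch of recoding them to a 0/1 indicator follows, shown on made-up values. The helper name `to_indicator` is hypothetical, and the sketch assumes the stored values are the character strings "Yes" and "No".

```r
# Hypothetical helper (not part of DSjobtracker): "Yes" -> 1, anything else (including NA) -> 0.
to_indicator <- function(x) as.integer(!is.na(x) & x == "Yes")

# Toy stand-in for the degree column (values are made up).
toy <- data.frame(BSc_needed = c("Yes", "No", NA), stringsAsFactors = FALSE)
toy$BSc_needed <- to_indicator(toy$BSc_needed)
toy
```

Treating `NA` as 0 mirrors the "If not mentioned -> 0" convention used for the other indicator columns. Keep the `NA` instead if the distinction matters for the analysis.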
Get data from DSjobtracker for specific years or all the years combined into one dataset
Description
The DSjobtracker dataset is updated each year through the Statistical Consultancy Service of the University of Sri Jayewardenepura. To accommodate structural changes in the data between years, this function returns either the dataset combined across all years or the data for a specific year.
Usage
get_data(year)
Arguments
year |
Either "all" or a year from 2020 onward (2020, 2021, ...) given as a numeric value |
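A short usage sketch of the two argument forms described above. It assumes the DSjobtracker package is installed and is skipped otherwise. The shape of the returned object is not documented here, so the sketch only prints its dimensions.

```r
# Assumes the DSjobtracker package is installed; the calls are skipped otherwise.
pkg_available <- requireNamespace("DSjobtracker", quietly = TRUE)

if (pkg_available) {
  all_years <- DSjobtracker::get_data("all")  # every year combined
  only_2021 <- DSjobtracker::get_data(2021)   # one year, passed as a number
  print(dim(all_years))
  print(dim(only_2021))
}
```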