Artemio Rimando - Evaluating a life measured in smiles | A data scientist lifestyle blog
Analysis • Data Science • R

Employee Turnover: A Risk Segmenting Investigation

In this post, I conduct a simple risk analysis of employee turnover using the Human Resources Analytics data set from Kaggle.

I describe this analysis as a simple risk-segmenting exercise because I want a general idea of which combinations of employee characteristics provide evidence of higher employee turnover.

To accomplish this, I developed a function in R that takes a data frame and two characteristics of interest and generates a matrix whose entries represent the probability of employee turnover given the two characteristics. I call these values turnover rates.

Human Resources Analytics Data

Firstly, let us go over the details of the human resources analytics data set.


hr_data <- read.csv("HR_comma_sep.csv", header = TRUE)

str(hr_data)

[Image: output of str(hr_data)]

The variables are described as follows:

  • satisfaction_level represents the employee’s level of satisfaction on a 0–100% scale
  • last_evaluation represents the employee’s numeric score on their last evaluation
  • number_project is the number of projects the employee has completed to date
  • average_montly_hours is the average number of hours the employee spends at work per month
  • time_spend_company is the number of years the employee has worked at the company
  • Work_accident is a binary variable where 1 means the employee experienced a workplace accident, and 0 otherwise
  • left is the binary class variable where 1 means the employee left, and 0 otherwise
  • promotion_last_5years is a binary variable where 1 means the employee was promoted in the last 5 years, and 0 otherwise
  • sales is a categorical variable representing the employee’s main job function
  • salary is a categorical variable representing the employee’s salary level

The Rate Function

The following R code presents the function used to conduct this analysis.


# To use rate_matrix, a data frame df must be supplied and two column names from df must be known. The data frame must contain a numeric binary class feature y.
# If any of the characteristics are numeric on a continuous scale, a cut must be specified to place the values into categorical ranges or buckets.

rate_matrix <- function(df, y, c1 = NA, c2 = NA, cut = 10, avg = TRUE) {

  # If y is not a binary integer, then stop the function.
  if (is.integer(df[[y]]) != TRUE) { stop("Please ensure y is a binary class integer.") }

  df_col_names <- colnames(df)

  if (is.na(c1) & is.na(c2)) {
    # Neither c1 nor c2 is available
    stop("Please recall function with a c1 and/or c2 value.")

  } else if (is.na(c2)) {
    # Only c1 is provided

    if (is.integer(df[[c1]])) {
      var1 <- as.character(df[[c1]])
      var1 <- unique(var1)
      var1 <- as.numeric(var1)
      var1 <- sort(var1, decreasing = FALSE)
    } else if (is.numeric(df[[c1]])) {
      var1 <- cut(df[[c1]], cut)
      df[[c1]] <- var1
      var1 <- levels(var1)
    } else {
      var1 <- df[[c1]]
      var1 <- as.character(var1)
      var1 <- unique(var1)
      var1 <- sort(var1, decreasing = FALSE)
    }

    c1_pos <- which(df_col_names == c1)  # Number of the column of characteristic c1

    var1_len <- length(var1)

    m <- matrix(NA, nrow = var1_len, ncol = 1)

    rownames(m) <- var1
    colnames(m) <- c1

    for (i in 1:var1_len) {
      bad <- df[, 1][which(df[, c1_pos] == var1[i] & df[[y]] == 1)]
      bad_count <- length(bad)

      good <- df[, 1][which(df[, c1_pos] == var1[i] & df[[y]] == 0)]
      good_count <- length(good)

      m[i, 1] <- round(bad_count / (bad_count + good_count), 2)
    }

  } else {
    # Both c1 and c2 are provided

    if (is.integer(df[[c1]])) {
      var1 <- as.character(df[[c1]])
      var1 <- unique(var1)
      var1 <- as.numeric(var1)
      var1 <- sort(var1, decreasing = FALSE)
    } else if (is.numeric(df[[c1]])) {
      var1 <- cut(df[[c1]], cut)
      df[[c1]] <- var1
      var1 <- levels(var1)
    } else {
      var1 <- df[[c1]]
      var1 <- as.character(var1)
      var1 <- unique(var1)
      var1 <- sort(var1, decreasing = FALSE)
    }

    if (is.integer(df[[c2]])) {
      var2 <- as.character(df[[c2]])
      var2 <- unique(var2)
      var2 <- as.numeric(var2)
      var2 <- sort(var2, decreasing = FALSE)
    } else if (is.numeric(df[[c2]])) {
      var2 <- cut(df[[c2]], cut)
      df[[c2]] <- var2
      var2 <- levels(var2)
    } else {
      var2 <- df[[c2]]
      var2 <- as.character(var2)
      var2 <- unique(var2)
      var2 <- sort(var2, decreasing = FALSE)
    }

    c1_pos <- which(df_col_names == c1)  # Number of the column of characteristic c1
    c2_pos <- which(df_col_names == c2)  # Number of the column of characteristic c2

    var1_len <- length(var1)
    var2_len <- length(var2)

    m <- matrix(NA, nrow = var1_len, ncol = var2_len)

    rownames(m) <- var1
    colnames(m) <- var2

    class_1 <- max(df[[y]])
    class_0 <- min(df[[y]])

    for (i in 1:var1_len) {
      for (j in 1:var2_len) {
        bad <- df[, 1][which(df[, c1_pos] == var1[i] & df[, c2_pos] == var2[j] & df[[y]] == class_1)]
        bad_count <- length(bad)

        good <- df[, 1][which(df[, c1_pos] == var1[i] & df[, c2_pos] == var2[j] & df[[y]] == class_0)]
        good_count <- length(good)

        m[i, j] <- round(bad_count / (bad_count + good_count), 2)
      }
    }
  }

  # Create the class 1 matrix report that includes averages
  if (avg == TRUE) {
    ColumnAverage <- apply(m, 2, mean, na.rm = TRUE)
    ColumnAverage <- round(ColumnAverage, 2)
    RowAverage <- apply(m, 1, mean, na.rm = TRUE)
    RowAverage <- round(RowAverage, 2)
    RowAverage <- c(RowAverage, NA)
    m <- rbind(m, ColumnAverage)
    m <- cbind(m, RowAverage)
    return(m)
  } else {
    return(m)
  }
}

Employee Turnover Data Investigation

To begin this data investigation, I assume the role of someone with significant experience and domain knowledge in Human Resources. I start this heuristic analysis from the premise that employee turnover is strongly affected by how an employee feels about their job and about the company.

Are employees with small satisfaction levels more likely to leave?

The first thing I would like to confirm is that employees with small satisfaction levels are more likely to leave.


satisfaction <- rate_matrix(df = hr_data, y = "left", c1 = "satisfaction_level", cut = 20, avg = TRUE)

View(satisfaction)

[Image: turnover rates by satisfaction_level range]

The function call here uses a cut value of 20; there is no particular significance to this number other than wanting enough buckets to show the pattern clearly.

As seen in the matrix, 92% of employees with satisfaction levels between 0.0891 and 0.136 left the company. This provides evidence that employees with the lowest satisfaction levels are at the highest risk of leaving.

As we would expect, the highest satisfaction levels, 0.954 to 1, experience 0% employee turnover.

For simplicity and ease of understanding, I define 0.5 as the average satisfaction level. Looking at the below-average ranges of 0.363 to 0.408 and 0.408 to 0.454, there is an oddly large increase in the risk of employees leaving. This particular area of employee satisfaction requires more investigation because it goes against intuition.

Are employees with below average satisfaction levels more likely to leave across different job functions?

To address this counterintuitive result, I continue the investigation by checking whether these satisfaction levels behave differently across other characteristics in the data. It is possible that the below-average satisfaction levels are tied to job function.


satisfaction_sales <- rate_matrix(df = hr_data, y = "left", c1 = "satisfaction_level", c2 = "sales", cut = 20, avg = TRUE)

View(satisfaction_sales)

[Image: turnover rates by satisfaction_level range and job function]

Here, the same satisfaction ranges of 0.363 to 0.408 and 0.408 to 0.454 are generally at high risk of leaving across all job functions. There is evidence to suggest that somewhat unhappy workers are willing to leave regardless of their job function.

Is an unhappy employee’s likelihood of leaving related to average monthly hours worked?

To continue investigating why the below-average satisfaction ranges experience higher employee turnover than expected, I look at the relationship between satisfaction levels and average monthly hours worked. It could be that below-average satisfaction levels at this company are tied to employees being overworked.


# First, convert the integer variable average_montly_hours into a numeric variable to take advantage of the function's ability to break down numeric variables into ranges.

average_montly_hours <- hr_data["average_montly_hours"]
average_montly_hours <- unlist(average_montly_hours)
average_montly_hours <- as.numeric(average_montly_hours)

hr_data["average_montly_hours"] <- average_montly_hours

satisfaction_avghours <- rate_matrix(df = hr_data, y = "left", c1 = "satisfaction_level", c2 = "average_montly_hours", cut = 20, avg = TRUE)

View(satisfaction_avghours)

[Image: turnover rates by satisfaction_level range and average_montly_hours range]

To reiterate, the row ranges represent satisfaction levels and the column ranges represent average monthly hours worked. Here, there is strong evidence to suggest that the high-turnover employees in the below-average satisfaction ranges of 0.363 to 0.408 and 0.408 to 0.454 work between 117 and 160 hours a month.

Using domain knowledge, a full-time employee typically works at least 160 hours a month, given that a full-time position requires 40 hours a week over 4 weeks. The data suggests that workers are more likely to leave given that they work less than a regular full-time employee! This was different from my initial train of thought that these employees were potentially overworked.

Given this finding, I come to one particular conclusion: the employees at the highest risk of leaving are contract, seasonal, or part-time employees.

By considering other variables such as the number of projects worked on by an employee, it is possible to further support this conclusion.


satisfaction_projects <- rate_matrix(df = hr_data, y = "left", c1 = "satisfaction_level", c2 = "number_project", cut = 20, avg = TRUE)

View(satisfaction_projects)

[Image: turnover rates by satisfaction_level range and number_project]

Here, it is evident that the below-average satisfaction ranges of 0.363 to 0.408 and 0.408 to 0.454 may in fact correspond to contract or part-time employees, as the probability of turnover drops sharply once more than 2 projects have been completed.

Are contract, part-time or seasonal employees more likely to be unhappy if the job is accident-prone?

Now that we have identified the high-risk groups of employee turnover within this data set, this question comes to mind because an employee’s enjoyment of their role should be tied to their satisfaction level. It could be that these part-time employees are experiencing hardships during their time at work, thereby contributing to their risk of leaving.

To answer this question, I take a look at the satisfaction level and number of projects completed given that an employee experienced a workplace accident.


# I use the package dplyr in order to filter the hr_data dataframe to only include observations that experienced a workplace accident
require(dplyr)

accident_obs <- filter(hr_data, Work_accident == 1)

satisfaction_accident <- rate_matrix(df = accident_obs, y = "left", c1 = "satisfaction_level", c2 = "number_project", cut = 20, avg = TRUE)

View(satisfaction_accident)

[Image: turnover rates by satisfaction_level range and number_project, for employees with a workplace accident]

Here, for the below-average satisfaction ranges of 0.363 to 0.408 and 0.408 to 0.454 with 2 completed projects, there is evidence of an even higher chance of turnover among employees who experienced a workplace accident.

Further Work

The purpose of this analysis was to apply a risk segmenting method on human resources analytics data to identify potential reasons for employee turnover. I used probabilities or turnover rates to help identify some groups of employees that were at risk of leaving the company.

I found that the chance of turnover was higher when an employee had an extremely low satisfaction level, but also discovered that certain types of employees (contract, part-time, seasonal) could be identified as high-risk groups. I also explored whether the unhappiness of these part-time employees could be attributed to working accident-prone jobs.

With the example presented in this post, Human Resources can use this information to put more effort into ensuring contract, part-time, or seasonal employees experience lower turnover rates. This analysis allowed us to identify which groups of employees are at risk and to identify potential causes.

This risk analysis approach can be applied to fields beyond Human Resources, including health and finance. It is useful to be able to produce quick, generic risk segments within your population so that risk management solutions can be implemented for the specific problem at hand.

Lastly, this post provides only one simple way to segment and analyze risk groups, and it is certainly not the only way! More advanced methods such as clustering and decision trees can identify risk groups more thoroughly and informatively to provide an even bigger picture; one such sketch, using a decision tree, follows. For quick checks against domain expertise in any particular field of practice, the rate function presented here can be sufficient for identifying risk groups.
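As a rough illustration of that direction, the sketch below fits a shallow decision tree on the same Kaggle file. It is written in Python with pandas and scikit-learn rather than R, purely to show the idea, and the feature list is an arbitrary subset of the columns described above.

# Hypothetical sketch: a shallow decision tree whose printed rules read as risk segments
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

hr = pd.read_csv("HR_comma_sep.csv")
features = ["satisfaction_level", "last_evaluation", "number_project",
            "average_montly_hours", "time_spend_company"]

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(hr[features], hr["left"])

# Each printed path (for example, splits on satisfaction_level and hours) describes one segment
print(export_text(tree, feature_names=features))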

Applied Learning • Data Science • Microsoft SQL Server • Python • Social Media

Extract, Transform, and Load Yelp Data using Python and Microsoft SQL Server

In this post, I will demonstrate a simple ETL process for Yelp data by calling the Yelp API in Python, then transforming and loading the data from Python into a Microsoft SQL Server database. This process is representative of any data science project that requires raw data to be extracted and stored for consumption by other applications or for further analysis.

Before we begin, the steps to this ETL process assume the following five things:

  1. We have a verified and registered Yelp account.
  2. We have Microsoft SQL Server and SQL Server Management Studio installed. This guide can help us install both Microsoft SQL Server 2014 Express and SQL Server 2014 Management Studio.
  3. We have Python and an IDE installed. This guide can help us install Anaconda which installs Python 3.6 and the Spyder IDE.
  4. The pyodbc module is installed after installing Anaconda. Using the Anaconda Prompt, refer to these instructions to install pyodbc; a quick check that the installation worked is sketched just after this list.
  5. We have a valid connection between Microsoft SQL Server and other local systems through the ODBC Data Source Administrator tool. Follow these simple steps to set up this connection.
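As a quick sanity check for items 4 and 5, a minimal sketch like the one below can confirm that pyodbc imports correctly and list the ODBC drivers and data sources visible on the machine (the names returned will vary by installation).

# Quick check that pyodbc is installed and can see ODBC drivers and DSNs
import pyodbc

print(pyodbc.drivers())      # installed ODBC drivers, e.g. 'SQL Server'
print(pyodbc.dataSources())  # DSNs configured in the ODBC Data Source Administrator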

EXTRACTION

To extract the raw Yelp data, we must make an API call to Yelp’s repositories.

Obtain App ID and App Secret

First, we go to the Yelp Developer page and scroll to the bottom and click ‘Get Started’.

[Image: Yelp Developer page with the Get Started button]

Next, we click on ‘Manage App’ in the left menu bar and record our App ID and App Secret. I whited out the App ID below, but you would see some form of text there. We will need these values in order to call the API within Python.

[Image: Manage App page showing the App ID and App Secret]

Run Yelp API Python Script

Next, using the App ID and App Secret, we run the following Python script which calls the Yelp API. In this example, I will be requesting business data for Kiku Sushi, a sushi restaurant that I have ordered from a few times.


# We import the requests module which allows us to make the API call
import requests

# Replace [app_id] with the App ID and [app_secret] with the App Secret
app_id = '[app_id]'
app_secret = '[app_secret]'
data = {'grant_type': 'client_credentials',
        'client_id': app_id,
        'client_secret': app_secret}
token = requests.post('https://api.yelp.com/oauth2/token', data = data)
access_token = token.json()['access_token']
headers = {'Authorization': 'bearer %s' % access_token}

# Call Yelp API to pull business data for Kiku Sushi
biz_id = 'kiku-sushi-burnaby'
url = 'https://api.yelp.com/v3/businesses/%s' % biz_id
response = requests.get(url = url, headers = headers)
response_data = response.json()

A successful API call will return the data in JSON format which is read by Python as a dictionary object.
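Note that if the call fails (for example, because the access token has expired), response.json() will not contain the business fields. One simple guard with the requests library is to check the status before parsing:

# Raise an HTTPError early if the API call did not succeed
response.raise_for_status()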

[Images: JSON response for the business, shown as a Python dictionary]

Notice that the url variable within the script is a string whose value follows the Yelp API documentation for requesting business data.

[Image: Yelp API documentation for the business details request]

The Request section in the documentation tells you the appropriate url to use. The Yelp API documentation also provides a brief overview of the data points and data types received from the API call. Knowing the different data points and their respective data types is important when we load the data into the Microsoft SQL Server database later on.

Accessing the Dictionary

Using the documentation, we can extract a few data points of interest by accessing the dictionary as you normally would using Python syntax. The following lines of code will provide examples of some data extractions.


# Extract the business ID, name, price, rating and address

biz_id = response_data['id']
biz_name = response_data['name']
price = response_data['price']
rating = response_data['rating']
review_count = response_data['review_count']
location = response_data['location']
address = location['display_address']
street = address[0]
city_prov_pc = address[1]
country = address[2]

At this point, the extraction of the data is complete and we move onto transforming the data for proper storage into Microsoft SQL Server.

TRANSFORMATION

To transform the extracted data points, we simply reassign their data types. If we do not complete this step, we will run into data type conversion issues when storing them in Microsoft SQL Server.

The following code simply reassigns the data types to the extracted data points that we would like to store.


# Reassign data types to extracted data points
biz_id = str(biz_id)
biz_name = str(biz_name)
price = str(price)
rating = float(rating)
review_count = int(review_count)
street = str(street)
city_prov_pc = str(city_prov_pc)
country = str(country)

After the transformations are complete, we move into the final stage of loading the data into Microsoft SQL Server.

LOADING

In order to load the data into a database such as Microsoft SQL Server, we need to ensure that a table exists with the appropriate column names and data types.

Microsoft SQL Server Table Creation

After we log into our default database engine in SQL Server Management Studio, we set up and run the following T-SQL code.


-- Note that the number assigned to each varchar represents the number of characters that the data point can take up

CREATE TABLE Yelp (
    id varchar(50),
    name varchar(50),
    price varchar(5),
    rating float,
    review_count int,
    street varchar(50),
    city_prov_pc varchar(50),
    country varchar(50)
)

This effectively creates a table with the appropriate data types that allows us to store the Yelp data we extracted and transformed.


SELECT * FROM YELP

When we run the T-SQL code, we should see an empty table. This verifies successful table creation.

[Image: empty Yelp table returned by the query]

Transferring Data from Python to Microsoft SQL Server

The last step is to run a Python script that takes the data points and saves them into Microsoft SQL Server. We run the following Python code to accomplish this task.


# We import the pyodbc module which gives us the ability and functionality to transfer data straight into Microsoft SQL Server
import pyodbc

# Connect to the appropriate database by replacing [datasource_name] with the data source name as set up through the ODBC Data Source Administrator and by replacing [database] with the database name within SQL Server Management Studio
datasource_name = '[datasource_name]'
database = '[database_name]'
connection_string = 'dsn=%s; database=%s' % (datasource_name, database)
connection = pyodbc.connect(connection_string)

# After a connection is established, we write out the data storage commands to send to Microsoft SQL Server
cursor = connection.cursor()

cursor.execute('INSERT INTO YELP (id, name, price, rating, review_count, street, city_prov_pc, country) values (?, ?, ?, ?, ?, ?, ?, ?)', biz_id, biz_name, price, rating, review_count, street, city_prov_pc, country)

cursor.commit()

After this script is run, we can do a final check that the data has been successfully loaded into the Microsoft SQL Server database by rerunning the Yelp table query. Once we do, we see that the data has in fact been transferred over.

[Image: Yelp table populated with the extracted business data]

FURTHER WORK

This simple ETL process for Yelp data demonstrated how to tap into Yelp’s data repository using Python, handle simple data type considerations, and load the data into Microsoft SQL Server.

One thing to note here is that we did not consider the more difficult data points to extract. For example, the Yelp API provides a data point corresponding to a restaurant’s operational hours, which is stored as a dictionary within a list within a dictionary. Although not too difficult to extract, these kinds of data points do require more work; a rough sketch of that extraction follows.
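The sketch below assumes the response actually contains an 'hours' entry shaped as described in the Yelp business endpoint documentation; a real response may omit it entirely, which is why the code guards with .get().

# Hypothetical sketch: pull the weekly opening hours out of the nested structure
# ('hours' is a list, its first element a dictionary, whose 'open' value is a list of dictionaries)
hours_blocks = response_data.get('hours', [])
if hours_blocks:
    for slot in hours_blocks[0].get('open', []):
        # 'day' runs 0-6 (Monday-Sunday); 'start' and 'end' are strings like '1100'
        print(slot['day'], slot['start'], slot['end'])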

Secondly, we should note that some data points are not always readily available because restaurant owners choose not to fill out this information. Also, as documented by Yelp, there will be no data available from an API call if the restaurant does not have any reviews (even if it clearly has a Yelp page)! We would have to account for the potential errors from the inability to extract specific information. For example, we could set up try-except blocks in the Python code and have Microsoft SQL Server store NULL values; a small sketch of this idea follows.
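The helper below is a minimal illustration of that approach, assuming we only care about a couple of optional fields such as price and rating; pyodbc stores Python None values as SQL NULL.

# Hypothetical helper: fall back to None (stored as NULL by SQL Server) when a key is missing
def safe_get(d, key):
    try:
        return d[key]
    except KeyError:
        return None

price = safe_get(response_data, 'price')    # None if the owner never set a price level
rating = safe_get(response_data, 'rating')  # None if the business has no reviews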

Another thing to note is that there are security and efficiency considerations when loading data into a database. This exercise did not consider database design, where it is almost always more efficient to define row keys and essential to keep data type storage as small as possible. It also did not demonstrate access to a secure database (where a username and password are required); a sketch of such a connection follows.
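For completeness, a connection with credentials through pyodbc might look like the sketch below; the data source name, user name, password and database name are placeholders for whatever your environment uses.

# Hypothetical sketch: connect to a secured SQL Server database with explicit credentials
import pyodbc

secure_connection = pyodbc.connect('DSN=[datasource_name];UID=[username];PWD=[password];DATABASE=[database_name]')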

Although it is obvious that more can be done, this post depicts the endless possibilities for how we may choose to further use this data. Now that the data is stored in a tabular format within Microsoft SQL Server, we can use it for further analysis or other purposes within our data science projects. Further work can be done to automate the data extraction process and to set up more advanced SQL tables. Finally, there is a wide variety of social media APIs out there to try out and master.

Data Science • Soft Skills

5 Lessons in Applied Data Science from Alternative Business Lending

After being part of Merchant Advance Capital, a fast-growing financial company, for about a year, I have come to accept the limitations that come with eagerly wanting to dive into data that is unique to the industry. Initially, it was frustrating to see that so many modelling practices and standards learned throughout my education could not simply be followed within the alternative business lending industry. As I slowly started to peel back what I knew, and opened myself up to the things I did not know in practice, I noticed that I needed to adapt my attitude and skills to what the company really needed from my role. I want to talk a little bit about what I have learned thus far and reflect on these lessons so that they may help me become a better data scientist.

1. Applying data science is pointless if you don’t know the data you’re working with and how it relates to your problem at hand.

The bulk of Merchant Advance Capital’s alternative lending practices is providing loans to subprime businesses within Canada. Many of these businesses lack the collateral to successfully obtain loans from a bank or are considering quick and cheap alternatives for their business needs. One important thing to note here is that building models to predict risk levels of different businesses requires knowing exactly what kinds of businesses you are lending to. It is super easy and sometimes tempting to gather a bunch of business characteristics and immediately send them through a machine learning algorithm to obtain predictions. It is always better to carefully choose, craft and analyze these characteristics and ensure that the relationships drawn from them make intuitive business sense. Domain expertise is very crucial.

2. Refrain from using machine learning algorithms where you cannot fully interpret the relationship between business characteristics and your model predictions.

I had to learn this the hard way when several reporting issues came about through different avenues. One such avenue was the operations department, where loan application administrators had a difficult time translating machine learning predictions to business owners and their respective sales representatives. As a result, trust in the scorecard regime began to erode. In the event that a loan is rejected, these parties deserve a fair reason as to why it was declined. If you build risk scorecards using black-box methods, more often than not your predictions will be very hard to interpret at the level of individual characteristics. It would also be difficult to explain why a business owner scored a certain way if a sales representative demanded a specific reason for the decline.

3. Refrain from using machine learning algorithms where you cannot fully understand the costs and benefits of your model predictions.

When I first developed a risk scorecard, little did I know how deeply its use would be woven into the core business of the company. The predictions of your machine learning model can translate into restrictions on product pricing and the promotion of certain products to different segmented populations. It is so important that the characteristics used to describe and understand your target population are quantifiable and make intuitive business sense. It may well be that these characteristics capture the unique aspects of your customer base that generate the most money or the most loss.

4. There must be a balance between the implementation of machine learning algorithms and the use of them at the operational level.

One of the biggest hypes in data science is the ability to utilize, understand and process big data in a matter of minutes. Applied data scientists often face operation-specific challenges such as a lack of data automation, collection and organization. In a subprime lending industry where the bulk of our customer base is somewhat technologically averse, the simplest solution for loan applications is through e-mail and paper submissions.

With huge technological inefficiencies restricting the data pipeline, I often run into give-and-take situations between predictive modelling and process automation. Sometimes efficiency is achieved by not including every business characteristic in the model, because it either cannot be automated, is costly to obtain or is simply untrustworthy. I often run into unfavourable validation statistics that could easily have been improved with more uncorrelated predictive features, but the data collection is inefficient and expensive.

Sometimes predictive prowess and operational efficiency have to go hand in hand. Of course, short-term downfalls such as these can slowly be overcome as operational changes improve, technological capabilities are enhanced, and further research is done to understand which data points are worth collecting.

5. The Financial Industry is well-known for its standard modelling practices and conservatism. Sometimes, it is more beneficial to use these practices as benchmarks and gain flexibility using alternative underwriting practices.

It is important to know what kinds of data are unique to the company and what would not typically be looked at by major financial institutions. With the rise of social media presence among today’s businesses, bad online reviews, nicely composed websites or product images can make or break the decision to receive financing. In cases like these, data science can immensely enhance the power of underwriting applications. The use of social media text analytics, geo-locational analysis, and the human experience can trump the analysis of the few financial ratios that financial institutions would normally be restricted to using.
