
Accurate candidate vetting is one of the notable features that Crewscale brings to the table. With the candidate-matching algorithm in place, companies can find a well-matched candidate for a job within seconds. This article explores that algorithm and what makes its search results credible.

The profile factors that govern the algorithm are:

  1. Skill proficiency
  2. Skill exposure freshness (i.e., how recently the candidate has been exposed to the skill)
  3. Work background
  4. Experience

Introduction

Crewscale’s matching algorithm uses the Neo4j graph database under the hood.

Unlike traditional relational databases, which arrange data in tables of rows and columns, Neo4j has a flexible structure defined by stored relationships between data records.

With Neo4j, each data record, or node, stores direct pointers to all the nodes it’s connected to. Because Neo4j is designed around this simple yet powerful optimization, it can follow long chains of connections directly instead of computing them at query time, which makes queries over highly connected data much faster and deeper than they would be with joins in a relational database.


 

Implementation

The skills listed in a job requirement become an input to the matchmaking algorithm.

Step 1

A search over the candidate pool fetches every candidate with matching skills. These candidates are then split into two pools, based on whether or not they have completed their assessments on Talscale, our tech-assessment platform.

Next, a Euclidean distance is calculated for every candidate who has passed the assessments. This distance measures how far the candidate’s scores are from an ideal score: the smaller the distance, the closer the candidate is to the ideal.
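The distance calculation can be sketched roughly as below. This is only an illustration, not Crewscale's actual code: the 0 to 100 score scale, the `IDEAL_SCORE` constant, the `distance_from_ideal` helper and the sample candidates are all assumptions made for the example.

```python
import math

# Assumption: assessment scores are per skill, on a 0-100 scale,
# and the "ideal" candidate scores 100 on every required skill.
IDEAL_SCORE = 100.0


def distance_from_ideal(scores, required_skills, ideal=IDEAL_SCORE):
    """Euclidean distance between a candidate's scores and the ideal score.

    A required skill with no score is treated as 0 (an assumption).
    """
    return math.sqrt(
        sum((ideal - scores.get(skill, 0.0)) ** 2 for skill in required_skills)
    )


# Hypothetical candidates who have passed their assessments.
candidates = {
    "alice": {"react": 90.0, "java": 80.0},
    "bob": {"react": 60.0, "java": 100.0},
}
required = ["react", "java"]

distances = {
    name: distance_from_ideal(scores, required)
    for name, scores in candidates.items()
}

# Candidates closest to the ideal come first.
closest_first = sorted(distances, key=distances.get)
```

A smaller distance means the candidate sits closer to the ideal score across all required skills, so one weak skill pulls the candidate away even if the others are strong.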

Step 2

When a job is defined, each required skill is assigned a weight that reflects its importance. The candidate score from Step 1 is then multiplied by the respective weight, so that the skills that matter most for the job contribute most to the match.
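A minimal sketch of the weighting step, assuming the Step 1 result is available as one score per skill. The `apply_weights` helper, the weight values and the scores are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def apply_weights(skill_scores, weights):
    """Multiply each skill score by the weight the job assigns to that skill.

    Skills the candidate has no score for contribute 0 (an assumption).
    """
    return {
        skill: skill_scores.get(skill, 0) * weight
        for skill, weight in weights.items()
    }


# Hypothetical job: React matters three times as much as Java.
job_weights = {"react": 3, "java": 1}

# Hypothetical per-skill scores carried over from Step 1.
candidate_scores = {"react": 80, "java": 60}

weighted = apply_weights(candidate_scores, job_weights)
weighted_total = sum(weighted.values())
```

With these numbers a candidate strong in React outranks one equally strong in Java, which is the intended effect of weighting by importance.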

Step 3

The candidate’s list of work experiences is searched for the most recent exposure to the skill, and an exposure value is assigned to the candidate based on it.
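The freshness lookup might look something like the sketch below. The article does not say how the exposure value is derived, so the decay formula `1 / (1 + years since last use)` is purely an assumption, as are the `exposure_value` helper and the shape of the experience records.

```python
from datetime import date


def exposure_value(experiences, skill, today):
    """Return a freshness value in (0, 1] for a skill, or 0.0 if never used.

    experiences: list of dicts with 'skills' (a set) and 'end' (a date).
    The decay formula is an illustrative assumption, not the real one.
    """
    end_dates = [e["end"] for e in experiences if skill in e["skills"]]
    if not end_dates:
        return 0.0
    most_recent = max(end_dates)
    years_since = max((today - most_recent).days, 0) / 365.25
    return 1.0 / (1.0 + years_since)


# Hypothetical work history.
history = [
    {"skills": {"java"}, "end": date(2018, 6, 30)},
    {"skills": {"react", "java"}, "end": date(2020, 1, 1)},
    {"skills": {"react"}, "end": date(2021, 1, 1)},
]
today = date(2021, 1, 1)

react_exposure = exposure_value(history, "react", today)
java_exposure = exposure_value(history, "java", today)
go_exposure = exposure_value(history, "go", today)
```

Passing `today` explicitly keeps the function deterministic, which makes the ranking reproducible in tests.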

Step 4

The candidate’s experience is then factored in alongside the score computed so far.

Step 5

Finally, all the factors calculated so far are combined into an overall score, and the list of candidates is sorted by that score.

The algorithm ran on the staging environment and returned the search results in 24 seconds. Hooray! Wait, what? Yes, 24 seconds. So, let’s talk about optimization next.
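As a rough illustration of the final step, the sketch below combines the factors and sorts the candidates. The combining formula, the `EXPERIENCE_BONUS` constant and the field names are hypothetical; the article does not state how the factors are actually merged.

```python
# Assumption: each year of experience adds a fixed bonus to the score.
EXPERIENCE_BONUS = 10.0


def overall_score(candidate):
    """Combine weighted skill score, exposure and experience into one number.

    The formula is an illustrative assumption, not the production one.
    """
    return (
        candidate["weighted_skill_score"] * candidate["exposure"]
        + candidate["experience_years"] * EXPERIENCE_BONUS
    )


# Hypothetical candidates with the factors from the earlier steps.
pool = [
    {"name": "a", "weighted_skill_score": 300.0, "exposure": 0.5, "experience_years": 2},
    {"name": "b", "weighted_skill_score": 200.0, "exposure": 1.0, "experience_years": 1},
]

# Highest overall score first.
ranked = sorted(pool, key=overall_score, reverse=True)
ranked_names = [c["name"] for c in ranked]
```

Here candidate "b" ranks first despite a lower skill score, because the skill exposure is more recent.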


 

Optimization

After the 24-second fiasco, the first thing I did was profile the query in Neo4j Desktop. I wanted to find the culprit that was dragging my algorithm down.

Level 1

I saw that the case-insensitive query for skill matching among the candidates was taking an alarming amount of time.

 

To give an example, a part of the initial query was something like the following:

WHERE (s.name =~ '(?i)react.*' OR s.name =~ '(?i)java.*')

This query did fetch all the relevant results regardless of upper or lower case, but at a cost in time that we could not afford. So I adopted an old-school fix: store every skill name in lowercase in the database, and convert the search string to lowercase before querying. This noticeably reduced the execution time.
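The idea behind the fix can be sketched outside the database. In this illustration, which uses hypothetical skill names and helper functions, the case-insensitive regular expression has to be evaluated against every stored name, whereas normalized names allow a plain prefix comparison on the lowercased search term.

```python
import re


def normalize(name):
    """Lowercase a skill name; applied once on write and once per search term."""
    return name.lower()


# Hypothetical skill names as users originally typed them.
raw_names = ["React", "ReactJS", "JAVA", "JavaScript", "Python"]

# What gets stored after the fix: lowercase only.
stored_names = [normalize(n) for n in raw_names]


def match_regex(names, term):
    """Original approach: case-insensitive regex over mixed-case names."""
    pattern = re.compile("(?i)" + re.escape(term) + ".*")
    return [n for n in names if pattern.fullmatch(n)]


def match_prefix(lowercase_names, term):
    """After the fix: plain prefix comparison on lowercase data."""
    term = normalize(term)
    return [n for n in lowercase_names if n.startswith(term)]


regex_hits = match_regex(raw_names, "react")
prefix_hits = match_prefix(stored_names, "REACT")
```

Both approaches find the same skills; the difference is that the second one needs no regular expression at query time. In Cypher, a comparable prefix check could be written with `STARTS WITH`, though the article does not show the rewritten query.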

Level 2

Indexing

Indexing optimizes database performance by minimizing the number of disk accesses required to process a query. An index is a data structure used to quickly locate and access data in a database. In a relational database, indexes are created on one or more columns; in Neo4j, they are created on properties.
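To make the idea concrete, here is a small sketch that contrasts a full scan with a lookup through an index. It uses a Python dictionary as a stand-in for a database index, and the records and helper names are hypothetical; it is not how Neo4j implements its indexes.

```python
from collections import defaultdict

# Hypothetical skill records.
records = [
    {"id": 1, "name": "react"},
    {"id": 2, "name": "java"},
    {"id": 3, "name": "python"},
    {"id": 4, "name": "react"},
]


def scan(rows, name):
    """Without an index: inspect every record on every query."""
    return [r["id"] for r in rows if r["name"] == name]


def build_index(rows, key):
    """Build the index once: map each property value to the matching ids."""
    index = defaultdict(list)
    for r in rows:
        index[r[key]].append(r["id"])
    return index


name_index = build_index(records, "name")

scan_result = scan(records, "react")
# With an index: jump straight to the matching ids.
index_result = name_index.get("react", [])
```

The scan touches all four records for each query, while the indexed lookup goes directly to the matching entries. The trade-off is that the index must be built and kept up to date on writes.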

Keeping this in mind, I learned that my query was fetching results based on the skill-name property, which was not indexed in the database yet. I referred to the types of indexing provided by Neo4j and chose the optimal one corresponding to my requirements.

 
Additionally, I analyzed the entire query and added indexes for every property that was always part of data retrieval.
 

This optimization sped up the matchmaking algorithm considerably. The execution time at this level was about 8 to 12 seconds: a remarkable improvement over 24 seconds, but still not enough.

Level 3

By this time, I had skimmed numerous articles on performance improvements, and the last thing I found lacking in the matching algorithm was the use of subqueries.

CALL {} Subquery in Neo4j

CALL {} lets you execute subqueries, i.e. queries inside of other queries. Subqueries allow you to compose queries, which is especially useful when working with UNION or aggregations.

A subquery is evaluated for each incoming input row and may produce an arbitrary number of output rows.
 
Every output row is then combined with the input row to build the result of the subquery. That means that a subquery will influence the number of rows.
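The row semantics described above can be modelled in a few lines. This is a conceptual sketch only, with hypothetical data and a made-up `call_subquery` helper; it imitates how output rows are combined with input rows and is not Neo4j code.

```python
def call_subquery(input_rows, subquery):
    """Run `subquery` once per input row and merge each output row into it."""
    for row in input_rows:
        for out in subquery(row):
            yield {**row, **out}


# Hypothetical incoming rows and data the subquery looks up.
candidate_rows = [{"candidate": "alice"}, {"candidate": "bob"}, {"candidate": "carol"}]
skills_by_candidate = {"alice": ["react", "java"], "bob": ["react"], "carol": []}


def matching_skills(row):
    """Subquery: zero, one or many output rows per input row."""
    return [{"skill": s} for s in skills_by_candidate[row["candidate"]]]


def skill_count(row):
    """Aggregating subquery: exactly one output row per input row."""
    return [{"skills": len(skills_by_candidate[row["candidate"]])}]


expanded = list(call_subquery(candidate_rows, matching_skills))
aggregated = list(call_subquery(candidate_rows, skill_count))
```

Three input rows become three output rows in `expanded`, but not the same three: alice contributes two rows and carol none. In `aggregated`, the aggregation is scoped to a single input row, so the row count stays at three.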
 
Needless to say, the matching algorithm does a great deal of evaluating and filtering, which results in many aggregations in the query.
 
To my surprise, after moving the aggregations into separate subqueries, the execution time improved dramatically! The final query execution took around 2 to 3 seconds.
 

 

Conclusion

In retrospect, Neo4j turned out to be well suited to our matchmaking algorithm.

Neo4j stores data already connected, which makes queries over those connections both possible and fast.
 
Neo4j also provides full database characteristics, including ACID transaction compliance, cluster support, and runtime failover, making it suitable for using graph data in production scenarios.
 
With about a million nodes and relationships in store, an efficient search for the top candidates based on numerous factors and filters, covering every permutation and combination, all in about 1 to 3 seconds, is truly a marvel!

 

Mansi

Software Engineering Lead @ Crewscale