Online recruitment portals are becoming more popular because they save time for both employers and job seekers. In recent years, most companies have posted their job descriptions on various online platforms or gathered large numbers of resumes from recruiting agencies. Manually processing these resumes and matching them against many requirements is a herculean task, and the growing volume of data makes it a real challenge to analyze each resume effectively across various parameters. Recruitment agencies and the talent acquisition (TA) teams of companies gather, process, and manage thousands of resumes. Automatically processing and analyzing these resumes saves time and money. Automatic grouping of resumes helps the hiring team and improves searchability in a large collection. In this research study, we propose a system that automatically extracts information from resumes and segments them based on various parameters. The segmented resumes can be forwarded to the appropriate department based on domain and skill set.
This segmentation can also be used for automatic mapping of skills to jobs. We have developed topic clustering models using Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) to cluster resumes from our own resume database. LDA is a topic model that generates topics based on word frequency from a set of documents, and it is particularly useful for finding reasonably accurate mixtures of topics within a given document set. We evaluated the performance of our resume segmentation using the coherence score, which assesses the quality of the learned topics, for both the LSI and LDA models on our dataset. The LDA model performs better than the LSI topic model, achieving a coherence score of 62.05%.
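
To make the idea of a coherence score concrete, the following is a minimal sketch rather than the evaluation used in this study. The abstract does not state which coherence measure was applied, so the sketch assumes the UMass measure, one common choice based on document co-occurrence counts; its values are log ratios and are not comparable to the percentage reported above. The function name `umass_coherence` and the toy resume tokens are hypothetical and serve only as illustration.

```python
import math


def umass_coherence(topic_words, documents):
    """UMass coherence of a single topic (illustrative assumption).

    topic_words: top words of the topic, most probable first.
    documents:   tokenized documents, each a list of strings.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every one of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(topic_words)):
        for l in range(m):
            d_l = doc_freq(topic_words[l])
            if d_l == 0:
                continue  # word absent from the corpus: skip the pair
            co = doc_freq(topic_words[m], topic_words[l])
            # Add-one smoothing keeps the logarithm defined when co == 0.
            score += math.log((co + 1) / d_l)
    return score


# Hypothetical toy corpus of tokenized resumes.
resumes = [
    ["python", "sql", "pandas", "statistics"],
    ["python", "pandas", "machine", "learning"],
    ["java", "spring", "sql", "microservices"],
    ["recruiting", "payroll", "onboarding", "compliance"],
]

coherent = umass_coherence(["python", "pandas"], resumes)
mixed = umass_coherence(["python", "payroll"], resumes)
print(coherent, mixed)
```

Words that tend to appear in the same resumes yield a higher score than words drawn from unrelated resumes, which is the intuition behind comparing topic models such as LSI and LDA by their coherence.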