TV Show Popularity Analysis Using Data Mining

Abstract

Television audience rating is an important indicator of the popularity of programs, and it also affects the revenue of broadcast stations through advertising. Although higher ratings for a given program benefit both broadcasters and advertisers, little is known about the factors that make programs more attractive to viewers. To study the popularity of performers, we consider the number of hits received by the tweets related to them on Twitter. In this project we use three data mining techniques: Decision Tree, Naïve Bayes, and XGBoost. We compare the models against one another to find the one that gives the most accurate results. The overall objective of our work is to predict more accurately which TV show will gain popularity in the future. We also have the option of developing a graphical user interface (GUI) that helps a novice user evaluate a show and predict its success.
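The model comparison described in the abstract can be sketched as a small accuracy harness. This is only an illustration: the feature layout, the data, and the two stand-in predictors are hypothetical, and in the actual project the Decision Tree, Naïve Bayes, and XGBoost models would take their place.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Sequence[float]
Predictor = Callable[[Row], int]


def accuracy(predict: Predictor, rows: List[Row], labels: List[int]) -> float:
    """Fraction of rows whose predicted label matches the true label."""
    if not rows:
        raise ValueError("need at least one row")
    correct = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return correct / len(rows)


def rank_models(
    models: Dict[str, Predictor], rows: List[Row], labels: List[int]
) -> List[Tuple[str, float]]:
    """Score every model on the same held-out data, best accuracy first."""
    scores = [(name, accuracy(predict, rows, labels)) for name, predict in models.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical held-out rows: [tweet hits in thousands, episode count],
# labelled 1 if the show turned out to be popular and 0 otherwise.
test_rows = [[120.0, 10], [5.0, 8], [80.0, 22], [2.0, 12]]
test_labels = [1, 0, 1, 0]

# Stand-in predictors (assumptions for illustration, not the project's models).
models = {
    "threshold_on_hits": lambda row: 1 if row[0] >= 50.0 else 0,
    "always_popular": lambda row: 1,
}

ranking = rank_models(models, test_rows, test_labels)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Scoring every model on the same held-out rows keeps the comparison fair; a trained classifier can be plugged in by wrapping its prediction method as a `Predictor`.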

Existing System

• New data sets that arrive in the middle of the process are added to the existing database.
• Clustering allows the data in the existing database to be handled effectively.
• New data is inserted into the database in the middle of the process, directly into the existing clusters.
• Incremental clustering is what makes this direct insertion into existing clusters possible.
• Incremental K-Means is not re-run over the whole database; it is re-run only for the outside points that were not placed in an existing cluster.
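The insertion step described above can be sketched as follows. The `Cluster` class, the `insert_points` function, and the `radius` threshold that decides whether a point fits an existing cluster are assumptions made for illustration; the original does not state how outside points are identified.

```python
import math
from typing import List, Tuple

Point = Tuple[float, ...]


class Cluster:
    """A cluster summarised by its centroid and the number of points in it."""

    def __init__(self, centroid: Point, size: int) -> None:
        self.centroid = centroid
        self.size = size

    def add(self, point: Point) -> None:
        # Running-mean update: no stored points are needed.
        self.size += 1
        self.centroid = tuple(
            c + (x - c) / self.size for c, x in zip(self.centroid, point)
        )


def insert_points(
    clusters: List[Cluster], new_points: List[Point], radius: float
) -> List[Point]:
    """Add each new point to its nearest cluster if it lies within `radius`.

    Returns the outside points, which are the only ones that would need
    to be clustered again.
    """
    outside: List[Point] = []
    for point in new_points:
        nearest = min(clusters, key=lambda cl: math.dist(cl.centroid, point))
        if math.dist(nearest.centroid, point) <= radius:
            nearest.add(point)
        else:
            outside.append(point)
    return outside


# Hypothetical existing clusters and newly arrived points.
clusters = [Cluster((1.0, 1.0), 4), Cluster((9.0, 9.0), 4)]
outside = insert_points(clusters, [(2.0, 1.0), (9.0, 8.0), (5.0, 20.0)], radius=3.0)
print(outside)
```

Only the points returned in `outside` are passed on for re-clustering; the existing clusters are updated in place.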

Disadvantages

• One technique is used for classification problems, mainly text classification involving high-dimensional training data sets.
• Another is used to find an optimal solution to a linear regression problem. It involves a "loss function", a "weak learner", and an "additive model".
• A predictive model needs to be built that works as a rating system, which will be useful for people who want to watch a particular show.
• Such a system should also collect feedback from previous viewers and use it to help new viewers.
• Vast databases from social media sites will be used for this purpose.

Proposed System

• Ester et al. propose Incremental DBSCAN, a data mining approach that is suitable for mining.
• They have also proposed many theories based on the clusters and techniques.
• The proposed algorithm is therefore used to reduce computation time and give better accuracy. Incremental clustering is a generalized approach that first performs clustering on the database and, after new data is added, resumes the process from that point.
• The proposed research aims to predict the TV show popularity rating. The two algorithms used for the analysis are K-Means and incremental K-Means.
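For reference, the non-incremental K-Means baseline can be sketched in a few lines of plain Python. This is a minimal version of the standard algorithm, not the project's implementation; the seeding from the first k points and the two features per show are assumptions chosen to keep the example deterministic.

```python
import math
from typing import List, Tuple

Point = Tuple[float, ...]


def mean(points: List[Point]) -> Point:
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(
    points: List[Point], k: int, max_iter: int = 100
) -> Tuple[List[Point], List[int]]:
    """Alternate assignment and centroid updates until assignments stop changing."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = list(points[:k])  # assumption: seed with the first k points
    assignment: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                centroids[i] = mean(members)
    return centroids, assignment


# Hypothetical features per show: (average rating, tweet hits in thousands).
shows = [(8.5, 120.0), (8.1, 110.0), (4.2, 6.0), (3.9, 4.0), (8.8, 130.0)]
centroids, labels = k_means(shows, k=2)
print(labels)
```

Every run of this baseline revisits all points, which is the cost that the incremental variant is meant to avoid when new data arrives.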

Advantages

• The algorithm checks which part of the space is affected by the new data. For pairs of objects, this algorithm is very efficient.
• The time and space complexity of these algorithms compares favorably with that of typical K-Means algorithms.
• Many researchers have developed new operators to improve efficiency and optimization.
• The main merits are simplicity, memory efficiency, and speed, which allow it to run on large data sets.
• Adding new data to a non-incremental approach decreases efficiency. Incremental clustering overcomes this by improving efficiency and helping to group the new data.
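One way to see where the memory efficiency can come from is the centroid update itself. Assuming each cluster stores only its centroid and its size (the symbols below are introduced here for illustration: c_n is the centroid of a cluster holding n points, and x is the newly added point), the new centroid follows from the old one without revisiting earlier points:

```latex
c_{n+1} = \frac{n\,c_n + x}{n+1} = c_n + \frac{x - c_n}{n+1}
```

Under this assumption, adding a point costs time proportional to the number of features, whereas recomputing the mean from scratch would cost time proportional to the cluster size.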
