Semantics of Data Mining Services in Cloud Computing

Abstract: The recent incorporation of new Data Mining and Machine Learning services by Cloud Computing providers gives users comprehensive data analysis tools together with all the advantages of this type of environment. Providers of Cloud Computing services for Data Mining publish their descriptions and definitions in many formats, which are often incompatible with those of other providers. From a functional point of view, the ability to describe complete Data Mining services is fundamental to maintaining the usability and, especially, the portability of these services, independently of the underlying software/hardware or the differences between cloud platforms. The main objective of this paper is to design a Data Mining service definition that allows a complete service to be composed with a single, simple definition, so that a data mining workflow can be ported and deployed on different providers or even in a marketplace of such ready-to-consume services. This article presents a semantic scheme for the definition and description of complete Data Mining services, covering both the management of the service by the provider (price, authentication, Service Level Agreement, ...) and the definition of the Data Mining workflow as a service. It represents a solid contribution toward the standardization and industrialization of Data Mining services. To assess the validity of the scheme, a list of services from Data Mining providers has been described, and a full example for a Random Forest algorithm has been defined as a service. In addition, a practical scenario has been developed: a deployment platform for Data Mining services that gives functional support to the scheme and illustrates the practical benefits of the proposal for the end user.
EXISTING SYSTEM:
• The most recent proposals try to unify different schemata and vocabularies previously developed following the Linked Data guidelines.
• With Linked Data, you can reuse vocabularies, schemata and concepts. This significantly enriches the definition of the schema, allowing you to create the model definition based on other existing schemata and vocabularies.
• Linked-USDL, MEX [core, algo, perf], ML-Schema, OntoDM or Exposé, among others, can be considered when creating a workflow for Data Mining/Machine Learning (DM/ML) service definition using Linked Data.
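The vocabulary reuse described above can be sketched in a few lines. This is a minimal illustration, assuming a prefix table in the style of compact IRIs; the namespace IRIs are placeholders under example.org, and the property names (`hasPrice`, `implements`) and the price value are hypothetical, not taken from Linked-USDL, ML-Schema or MEX.

```python
# Hypothetical prefix table. The IRIs are placeholders (assumption),
# NOT the real namespaces of Linked-USDL, ML-Schema or MEX.
PREFIXES = {
    "usdl": "https://example.org/linked-usdl#",
    "mls": "https://example.org/ml-schema#",
    "mexalgo": "https://example.org/mex-algo#",
}


def expand(term: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Expand a compact term such as 'mls:implements' into a full IRI."""
    prefix, sep, local = term.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown vocabulary prefix in {term!r}")
    return prefixes[prefix] + local


# One service description that borrows terms from two vocabularies.
# Property names and values are illustrative assumptions.
service = {
    "usdl:hasPrice": "0.10 USD/hour",
    "mls:implements": "mexalgo:RandomForest",
}

# Expanding the keys shows which vocabulary each property comes from.
expanded = {expand(key): value for key, value in service.items()}
```

Because every term resolves to an IRI owned by an existing vocabulary, a new schema only needs to define the terms that no existing vocabulary already provides.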
DISADVANTAGE:
• These proposals address the modeling and definition of services generically, without dealing with the specific details of DM services in Cloud Computing (CC).
• When DM workflows are composed from services, or definitions of services, in CC, this problem is overcome: the description of these services is standardized and all the complexity of the execution is left to the CC platform.
• There are several proposals for the definition of services, covering a wide variety of both syntactic and semantic languages, to achieve a correct definition and modeling of services. Solutions based on Linked Data can solve the problem of defining services from a more comprehensive perspective.
PROPOSED SYSTEM:
• We propose dmcc-schema, a schema and a set of vocabularies designed to address the problem of describing and defining DM/ML services in CC.
• It not only focuses on solving the specific problem of modelling, with the definition of workflows and algorithms, but also includes the main aspects of a CC service.
• dmcc-schema can be considered a Linked Data proposal for DM/ML services.
• Existing Linked Data vocabularies have been integrated into dmcc-schema, and new vocabularies have been created ad hoc to cover certain aspects that are not implemented by other external schemata.
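To make the idea of a single definition covering both service management and the DM workflow concrete, here is a minimal sketch of what such a description could look like when built in Python and serialized in a JSON-LD-like form. The namespace IRI, every `dmcc:` property name, the helper `describe_service` and all values are assumptions made for illustration; they are not the actual dmcc-schema vocabulary.

```python
import json

# Placeholder namespace (assumption), not the published dmcc-schema IRI.
DMCC = "https://example.org/dmcc#"


def describe_service(name, algorithm, params, price_per_hour, auth, sla_uptime):
    """Build one description with a management part and a workflow part.

    All 'dmcc:' property names below are hypothetical.
    """
    return {
        "@context": {"dmcc": DMCC},
        "@type": "dmcc:Service",
        "dmcc:name": name,
        # Provider-side management: price, authentication, SLA.
        "dmcc:management": {
            "dmcc:price": {"dmcc:amount": price_per_hour, "dmcc:unit": "USD/hour"},
            "dmcc:authentication": auth,
            "dmcc:sla": {"dmcc:uptimePercent": sla_uptime},
        },
        # The DM workflow itself: algorithm plus its parameters.
        "dmcc:workflow": {
            "dmcc:algorithm": algorithm,
            "dmcc:parameters": params,
        },
    }


# Illustrative values only; none of them come from a real provider.
service = describe_service(
    name="random-forest-classification",
    algorithm="RandomForest",
    params={"n_trees": 100, "max_depth": 10},
    price_per_hour=0.10,
    auth="api-key",
    sla_uptime=99.9,
)

# A provider-independent document that could be handed to any platform
# that understands the same vocabulary.
document = json.dumps(service, indent=2, sort_keys=True)
```

Keeping the management and workflow parts in one document is what would let the same description be moved between providers, since nothing in it refers to a specific platform.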
ADVANTAGE:
• The most recent proposals try to unify different schemata and vocabularies previously developed following the Linked Data (LD) guidelines.
• With LD, you can reuse vocabularies, schemata and concepts. This significantly enriches the definition of the schema, allowing you to create the model definition based on other existing schemata and vocabularies.
• Frameworks and libraries are integrated into the most modern and efficient programming languages, such as C, Java, Python, R, Scala and others.
• These environments lack elements for CC services modeling due to their on-premise nature.
