Premium Essay

Datamining

In: Computers and Technology

Submitted By ankush62
Words 874
Pages 4
What is data mining?
* Data mining (knowledge discovery from data)
  * Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data
  * Data processing that uses sophisticated data search capabilities and statistical algorithms to discover patterns and correlations in large preexisting databases; a way to discover new meaning in data

2. KDD process
* General functionality
  * Descriptive data mining
  * Predictive data mining
* Different views lead to different classifications
  * Data view: kinds of data to be mined
  * Knowledge view: kinds of knowledge to be discovered
  * Method view: kinds of techniques utilized
  * Application view: kinds of applications adapted

Data mining issues
* Mining methodology
  * Mining different kinds of knowledge from diverse data types, e.g., bio, stream, Web
  * Performance: efficiency, effectiveness, and scalability
  * Pattern evaluation: the interestingness problem
  * Incorporation of background knowledge
  * Handling noise and incomplete data
  * Parallel, distributed and incremental mining methods
  * Integration of the discovered knowledge with existing knowledge: knowledge fusion
* User interaction
  * Data mining query languages and ad-hoc mining
  * Expression and visualization of data mining results
  * Interactive mining of knowledge at multiple levels of abstraction
* Applications and social impacts
  * Domain-specific data mining and invisible data mining
  * Protection of data security, integrity, and privacy

Why Data Mining? Potential Applications

* Data analysis and decision support
  * Market analysis and management
    * Target marketing, customer relationship management (CRM), market basket…...
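
Market basket analysis, the last application named above, is usually framed in terms of support (how often a set of items appears together) and confidence (how often a rule holds when its left-hand side is present). The following is a minimal sketch of those two measures, not a method taken from the essay. The baskets and the function names `count`, `support`, `confidence` and `frequent_pairs` are all hypothetical, invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy baskets, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def count(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for basket in baskets if items <= basket)


def support(itemset, baskets):
    """Fraction of baskets that contain the whole itemset."""
    return count(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets holding the antecedent, the fraction that also hold the consequent.

    Assumes the antecedent occurs in at least one basket.
    """
    both = set(antecedent) | set(consequent)
    return count(both, baskets) / count(antecedent, baskets)


def frequent_pairs(baskets, min_count):
    """Item pairs that occur together in at least min_count baskets."""
    pair_counts = Counter()
    for basket in baskets:
        pair_counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


print(support({"bread", "butter"}, transactions))
print(confidence({"bread"}, {"butter"}, transactions))
print(frequent_pairs(transactions, min_count=2))
```

A rule such as "bread implies butter" is normally kept only when both measures clear thresholds chosen by the analyst. Counting every pair, as `frequent_pairs` does, is workable for toy data only; large transaction sets generally need a pruning strategy.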

Similar Documents

Premium Essay

Data Mining

...effective way of predicting customer behavior and buying patterns. Measures need to be taken not only to overcome the stigma that data mining is insecure and takes away personal freedom, but also to make sure individual information is, in fact, protected. If these measures are taken, data mining is a win-win for both businesses and consumers. Consumers will feel heard, understood, and taken care of. Businesses can focus resources on building the business-to-customer relationship and will be able to give people what they need. References Advantages and disadvantages of data mining. (2012). Retrieved May 30, 2012, from http://www.dataminingtechniques.net/data-mining-tutorial/advantages-and-disadvantages-of datamining/ Ali, R., Ghani, U., & Saeed, A. (n.d.). Data clustering and its applications. Retrieved May 30, 2012, from http://members.tripod.com/asim_saeed/paper.htm Association rule learning. (2012, May 30). In Wikipedia, The Free Encyclopedia. Retrieved 13:46, May 30, 2012, from http://en.wikipedia.org/wiki/Association_rule_learning Data mining: issues. (n.d.). Retrieved May 30, 2012, from http://www.anderson.ucla.edu /faculty/jason.frand/teacher/technologies/palace/issues.htm Exforsys Inc. (2006). Data mining privacy concerns. Retrieved May 30, 2012, from http://www.exforsys.com/tutorials/data-mining/data-mining-privacy-concerns.html Li, X. & Sarkar, S. (2006). Privacy protection in data......

Words: 1900 - Pages: 8

Premium Essay

Study Guide

...Enterprise information systems focus on data warehouse
Important terms and concepts
i. Definitions and purposes of data warehouse
ii. Definitions of data mart
iii. Data ware data type
iv. Metadata
4. Week 9 Data Communication
a. Types of networks
i. PAN/LAN/CAN/WAN/MAN
ii. Bluetooth, WiFi, WiMax
iii. Terms/concepts: Packet switching; Internet protocol; TCP/IP; VOIP (definition and advantages/disadvantages); VPN (definition); Hotspots; Access points; Tunnel
5. Week 13 Business Intelligence
Definitions and architecture of BI
Analytical tools
i. Definitions
ii. OLAP: 1. Drill-down 2. Pivot tables & Pivoting 3. Slicing 4. Dicing
iii. Data Mining: 1. Definitions 2. Supervised data mining 3. Unsupervised data mining
iv. Decision support systems: 1. Decision types (unstructured/structured/semi-structured) 2. Comparison of DSS/MIS/TPS/EIS (executive information systems) 3. Model-oriented DSS 4. Data-oriented DSS 5. Sensitivity analysis
Monitoring tools
i. KPIs
ii. Balanced scorecard
iii. Digital dashboards
iv. Scorecards
c. ...

Words: 273 - Pages: 2

Free Essay

Persuasion Notes

...more exposure you have the more comfortable you are with it. READING #3: Morales, A.C. (2005). Giving firms an “E” for effort: This research shows that consumers reward firms for extra effort. The rewarding process is defined broadly as general reciprocity. When consumers infer that effort is motivated by persuasion, however, they no longer feel gratitude and do not reward high-effort firms. READING #4: Mrs. Keech and prophecy: In the 1950s, a cult brainwashed people into leaving their homes, families, jobs, etc. to prepare to leave on a flying saucer. MOVIES: The Persuaders (Douglas Rushkoff, 2004): how advertising companies use media, film and commercials to win over consumers, and how political parties do the same, e.g. DATAMINING: there are rooms full of information on each person in the world, so that companies can know each person's habits, likes, dislikes, statistics, etc. Wag the Dog (Barry Levinson, 1997): When the president is caught in a sex scandal two weeks before re-election, a PR crisis-management agent comes in to create a diversion by creating a "fake war". He hopes that the media will focus on this instead of the current sex scandal. ...

Words: 1313 - Pages: 6

Premium Essay

A Study on the Systems Applications and Products (Sap) and Statistical Analysis System (Sas) Software

...Retrieved October 25, 2011 from http://itknowledgeexchange.techtarget.com/itanswers/sap-vs-sas/. Peterson. W. (1991). Advances in input-output analysis: technology, planning, and development. Oxford University Press. Raa, T. T. (2006). The Economics of input-output analysis. Cambridge University Press. SAS Institute Inc. (1976). SAS: the power to know. Retrieved October 25, 2011 from http://www.sas.com/software/data-management/surveyor-sap/index.html SAS ties up with Asia Pacific College. (2010). Retrieved October 25, 2011 from http://www.philstar.com/Article.aspx?articleId=634410&publicationSubCategoryId=71. SAS v/s SAP analytics. (2006). Retrieved October 25, 2011 from http://businessintelligence.ittoolbox.com/groups/vendor-selection/datamining-select/sas-vs-sap-analytics-1149112....

Words: 268 - Pages: 2

Premium Essay

It Consultation for Mr Green

...can afford. References: Ricardo, C. (2012). Databases Illuminated. Jones & Bartlett Learning, Sudbury, MA. Unknown (2011). To five benefits of a data warehouse. Retrieved from: http://spotfire.tibco.com/blog/?p=7597 Harris, R. (2007). Google’s 650,000 core warehouse. Retrieved from: http://www.zdnet.com/blog/storage/googles-650000-core-warehouse-size-computer/213 Harris, D. (2013). Why Apple, eBay, and Wal-Mart have some of the biggest data warehouses you have ever seen. Retrieved from: http://gigaom.com/2013/03/27/why-apple-ebay-and-walmart-have-some-of-the-biggest-data-warehouses-youve-ever-seen/ Unknown (2013). Data warehouse architecture. Retrieved from: http://datamining zone.weeebbly.com Unknown (2013). 10 tips to optimize data warehouse. Retrieved from: http://quaero.csgi.com/blog/457-10_tips_to_optimize_data_warehouse_reporting...

Words: 1106 - Pages: 5

Premium Essay

Managing Data Resources

... 250
Too many data marts create complexity, costs, and management problems.
Answer: True Difficulty: Medium Reference: p. 250
Datamining helps companies engage in target marketing.
Answer: True Difficulty: Medium Reference: p. 251
Datamining poses challenges to the protection of individual privacy.
Answer: True Difficulty: Easy Reference: p. 251
Hypermedia databases store chunks of multimedia information in the form of nodes connected by links the......

Words: 4937 - Pages: 20

Premium Essay

Datamining

...Chapter 3 The Relational Model Review Questions

3.1 Discuss each of the following concepts in the context of the relational data model: (a) Relation (b) Attribute (c) Domain (d) Tuple (e) Intension and Extension (f) Degree and Cardinality.
Each term defined in Section 3.2.1.

3.2 Describe the relationship between mathematical relations and relations in the relational data model.
Let $D_1, D_2, \dots, D_n$ be $n$ sets. Their Cartesian product is defined as:
$$D_1 \times D_2 \times \dots \times D_n = \{(d_1, d_2, \dots, d_n) \mid d_1 \in D_1, d_2 \in D_2, \dots, d_n \in D_n\}$$
Any set of $n$-tuples from this Cartesian product is a relation on the $n$ sets. Now let $A_1, A_2, \dots, A_n$ be attributes with domains $D_1, D_2, \dots, D_n$. Then the set $\{A_1{:}D_1, A_2{:}D_2, \dots, A_n{:}D_n\}$ is a relation schema. A relation $R$ defined by a relation schema $S$ is a set of mappings from the attribute names to their corresponding domains. Thus, relation $R$ is a set of $n$-tuples:
$$(A_1{:}d_1, A_2{:}d_2, \dots, A_n{:}d_n) \quad \text{such that } d_1 \in D_1, d_2 \in D_2, \dots, d_n \in D_n$$
Each element in the $n$-tuple consists of an attribute and a value for that attribute. Discussed fully in Sections 3.2.2 and 3.2.3.

3.3 Describe the differences between a relation and a relation schema. What is a relational database schema?
A relation schema is a named relation defined by a set of attribute and domain name pairs. A relational database schema is a set of relation schemas, each with a distinct name. Discussed in Section 3.2.3.

3.4 Discuss the......
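
The answer to 3.2 above, that a relation is any subset of the Cartesian product of its domains, can be checked on a small case. This is an illustrative sketch only: the two domains, the attribute names `id` and `city`, and the chosen tuples are assumptions made for the example and do not come from the textbook.

```python
from itertools import product

# Hypothetical domains, chosen only to keep the example small.
D1 = {1, 2, 3}
D2 = {"London", "Paris"}

# The Cartesian product D1 x D2: every possible (d1, d2) pair.
cartesian = set(product(D1, D2))

# A relation on D1 and D2 is any subset of that product.
relation = {(1, "London"), (3, "Paris")}
is_valid_relation = relation <= cartesian

# A relation schema pairs attribute names with domains (names are hypothetical).
schema = {"id": D1, "city": D2}

# Each tuple viewed as a mapping from attribute name to value.
named_tuples = [dict(zip(schema, row)) for row in sorted(relation)]

degree = len(schema)         # number of attributes
cardinality = len(relation)  # number of tuples

print(len(cartesian), is_valid_relation, degree, cardinality)
print(named_tuples)
```

Here the degree is fixed by the schema while the cardinality changes as tuples are added or removed, which matches the intension/extension distinction listed in question 3.1.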

Words: 3750 - Pages: 15

Premium Essay

Student

...Observe and generate information about the areas of the Freight and Logistics Pole.
TECHNICAL SKILLS
Industrial:
• Quality Control / Management.
• Production management and inventory.
• Logistics / Supply Chain Management.
• Operations Research: formulation, simulation and resolution of industrial problems.
• Statistics and Probability.
• Project Management.
Computer:
• Platform: Windows, Mac OS.
• Programming: Java, HTML, PHP, JavaScript, SQL, Visual Basic.
• Modeling Techniques: UML.
• Excellent mastery of: MS Office (Access, Excel, Word, PowerPoint) and iWork pack.
• Software: Solidworks, ARENA "Simulation of Industrial Modules", MySQL "Database", LINDO "Resolution of industrial problems / simplex method", MATLAB.
• Datamining: Association Rules using Weka software.
• Presentations: MS PowerPoint 2013, Prezi, Keynote.
• Statistical Analysis: StaPlus, MS Excel 2013, SAP, SPSS.
Other Skills:
• Presentations and Public Speaking
• Mind Mapping Techniques and working methodologies
• Projects and Team Management
• Critical Thinking
• Time Management
• Judgment and Decision Making
• Learning Strategies
PROJECTS
• Elaboration of the IT plan for the Tree House café start-up.
• Construction and analysis of the performance of a catapult.
• Creating and managing a database using SQL Enterprise Manager.
• Construction and analysis of the performance of a conveyor.
• Simulation of a conveyor plant.
• Realization of a production line of lamp candles.
SEMINARS,......

Words: 667 - Pages: 3

Free Essay

Singularity Notes

...nation's grid. Currently, the price sits as low as $0.70 per watt. Once we have molecular nanotechnology-based manufacturing, we can produce solar panels extremely inexpensively, basically at the cost of raw materials. They could eventually be as inexpensive as a penny per square meter. We could put solar panels everywhere: on buildings and the majority of human surfaces. We could put solar satellites into space and beam the power to Earth via microwave. Each unit could provide billions of watts of electricity. Medicine 213 New ECG analysis for long-term unobtrusive monitoring, to detect early warning signs of heart disease. AI programs to do pattern recognition and intelligent data mining in the development of new drug therapies. Intelligent datamining tools to find new ways to disrupt the metabolisms of pathogens. CPOE checks every order for possible allergies in a patient, drug interactions, duplications, drug restrictions, guidelines, etc. Pattern recognition applied to protein patterns can better detect ovarian cancer. Augmentations 197 Billions or trillions of nanobots can be put into the bloodstream and used to scan the human brain to reverse engineer it. Blood cell replacements will perform hundreds or a thousand times better than the cells they replace. Microbivores will be more effective than white blood cells. They can mend DNA transcription errors and implement DNA changes. Other robots can serve as cleaners, removing unwanted debris and chemicals from cells. ...

Words: 399 - Pages: 2

Premium Essay

Crm Curriculum

...technologies which enable an organization to get a single unified view of its customers across various ‘points of contact’. A balanced approach to CRM combines both operational and analytical technologies. Data warehousing and data mining form the support base for both operational and analytical CRM. The role of data warehousing and data mining tools and applications is highlighted to help appreciate the analytical aspects of CRM. Coverage includes the three main components of comprehensive CRM solutions: Campaign Management, Sales Force Automation, and Customer Service and Support. The functionalities and applications of a few popular CRM products targeted at large enterprises (Siebel, SAP CRM) and a few targeted at small and medium enterprises (SalesLogix, Microsoft CRM) will be covered. An overview of the emerging hosted CRM products (Salesforce.com) and social CRM will also be covered. Finally, the role of contact centers in building customer relationships is highlighted. Topics include • Sales Force Automation • Customer Service and Support • Marketing (Campaign Management) • Data warehouse & data mining • Evaluating technological solutions for CRM • Role of a contact center in building relationships • Components of a contact center • Economics of a contact center Session 11-12: Components of eCRM Solutions Reading – Chapter 7 Case Analysis – Pilgrim Bank: Customer Profitability Session 13: Product Offerings in the CRM......

Words: 2004 - Pages: 9

Premium Essay

Marketing Chapter 5

...must know their customers! Customer database = organised collection of comprehensive information about individual customers or prospects that is current, accessible and actionable for such marketing purposes as lead generation, lead qualification, sale of a product or service or maintenance of customer relationships. Database marketing = the process of building, maintaining and using customer databases and other databases for the purposes of contacting, transacting and building customer relationships. Customer databases Customer mailing list = simply a list of names and addresses and telephone numbers. Customer database contains far more information. Business database = business customers’ information Data Warehouses and Datamining Datamining = marketing statisticians can extract useful information about individuals, trends and segments from the mass of data. Using the database 1. To identify prospects 2. To decide which customers should receive a particular offer 3. To deepen customer loyalty 4. To reactivate customer purchases 5. To avoid serious customer mistakes The Downside of Database Marketing & CRM 4 problems can deter a firm from effectively using CRM: 1. Building and maintaining a customer database requires hardware, software, skilled personnel and it is difficult to collect the right data. 2. Difficulty getting everyone in the company to be customer-orientated and to use the available information 3. Not all customers want a relationship......

Words: 2074 - Pages: 9

Premium Essay

Rohan

...which enable an organization to get a single unified view of its customers across various ‘points of contact’. A balanced approach to CRM combines both operational and analytical technologies. Data warehousing and data mining form the support base for both operational and analytical CRM. The role of data warehousing and data mining tools and applications is highlighted to help appreciate the analytical aspects of CRM. Coverage includes the three main components of comprehensive CRM solutions: Campaign Management, Sales Force Automation, and Customer Service and Support. The functionalities and applications of a few popular CRM products targeted at large enterprises (Siebel, MySAP, PeopleSoft, Oracle) and a few targeted at small and medium enterprises (SalesLogix, Talisma, Microsoft CRM, Onyx, SalesNotes) will be covered. An overview of the emerging hosted CRM products is also provided. Finally, the role of contact centers in building customer relationships is highlighted. Topics include – Sales Force Automation – Customer Service and Support – Marketing (Campaign Management) – Data warehouse & data mining – Evaluating technological solutions for CRM – Role of a contact center in building relationships – Components of a contact center – Economics of a contact center Session 11: Components of eCRM Solutions Reading – Chapter 7 Session 12: Case Analysis – Pilgrim Bank: Customer......

Words: 1490 - Pages: 6

Premium Essay

Database 6th Ch1 and Ch2

...users? Discuss the main activities of each. Answer: access to the database for querying, updating, and generating reports; the database primarily exists for their use.
1.6. Discuss the capabilities that should be provided by a DBMS. Answer: efficiently executing queries and updates.
1.7. Discuss the differences between database systems and information retrieval systems. Answer: there is a need to apply many of the IR techniques to processing data on the Web. Data on Web pages typically contains images, text, and objects that are active and change dynamically.
Exercises
1.8. Identify some informal queries and update operations that you would expect to apply to the database shown in Figure 1.2. Answer: INSERT INTO COURSE VALUES ('Datamining', 'CS3390', 3, 'CS')
Chapter 2 Database System Concepts and Architecture Review Questions
2.1. Define the following terms: data model, database schema, database state, internal schema, conceptual schema, external schema, data independence, DDL, DML, SDL, VDL, query language, host language, data sublanguage, database utility, catalog, client/server architecture, three-tier architecture, and n-tier architecture. Answer: Data model: a collection of concepts that can be used to describe the structure of a database. Database schema: the description of a database. Database state: the data in the database at a particular moment in time. Internal schema: describes the physical storage structure of the database. Conceptual......
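
The insert given as the answer to Exercise 1.8 above can be tried against an in-memory SQLite database. The sketch below is an assumption-laden illustration: Figure 1.2 is not reproduced in this excerpt, so the column names `Course_name`, `Course_number`, `Credit_hours` and `Department` are hypothetical stand-ins, and the follow-up query and update are added only to show one informal query and one update operation.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical COURSE layout; the real Figure 1.2 columns are not shown here.
conn.execute(
    """
    CREATE TABLE COURSE (
        Course_name   TEXT,
        Course_number TEXT PRIMARY KEY,
        Credit_hours  INTEGER,
        Department    TEXT
    )
    """
)

# The insert from Exercise 1.8, with values passed as parameters.
conn.execute(
    "INSERT INTO COURSE VALUES (?, ?, ?, ?)",
    ("Datamining", "CS3390", 3, "CS"),
)

# An informal query: which courses does the CS department offer?
cs_courses = conn.execute(
    "SELECT Course_name, Credit_hours FROM COURSE WHERE Department = ?",
    ("CS",),
).fetchall()

# An update operation: change the credit hours of one course.
conn.execute(
    "UPDATE COURSE SET Credit_hours = ? WHERE Course_number = ?",
    (4, "CS3390"),
)
updated_hours = conn.execute(
    "SELECT Credit_hours FROM COURSE WHERE Course_number = ?",
    ("CS3390",),
).fetchone()

print(cs_courses)
print(updated_hours)
conn.close()
```

Passing values through `?` placeholders rather than building the SQL string by hand is the usual way to avoid quoting mistakes such as the curly quotes in the original answer.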

Words: 1273 - Pages: 6

Premium Essay

Datamining

...MSc. Information System Management Kyaw Khine Soe (3026039) Data Mining and Business Analytics Boston Housing Dataset Analysis. Table of Contents: Introduction; Problem Statement; The associated data of Boston; Data pre-processing / Data preparation; Clustering Analysis; Cluster segment profile; Regression Analysis; Predictive analysis using neural network node; Decision tree node; Regression node analysis; Model Comparison; The recommendation and conclusion; Bibliography. Introduction This report forms part of the assignment for Data Mining and Business Analytics. It uses the Boston Housing Dataset to illustrate prediction, cluster analysis, neural networks and decision trees. Boston Housing is a real-estate dataset from Boston, Massachusetts. It is a small dataset of 506 rows that can be used to predict housing prices with regression, decision trees and neural networks. This report analyses property price against size, age of property, and environmental factors such as crime rate, proximity to the river (a dummy variable), distance to employment centers and pollution. Problem Statement In relation to housing intelligence, real estate businesses are usually concerned with the following common business questions: 1. Which areas have high rates of crime? How do crime rates affect housing prices? How can crime be reduced? 2. Which area has the highest/lowest house prices based on rooms in house/ area......

Words: 2101 - Pages: 9

Premium Essay

Essentials

...Essentials Session X Information Resources Management 11
Four types of Information Systems
• Operational
• Decision Support
• Managerial
• Executive
• Decision-making becomes more complex the more executive the level
• Operational systems have been around a long time and tend to have good ROIs
Current Technologies for Strategic Information Systems
What are the latest technologies of interest?
• CPUs and software, open source code
• Client server computing
• Interactive multimedia
• Developments in Electronic Commerce
• TCP/IP and the Internet
• Databases and Datamining
• Handhelds, M-commerce
• Knowledge Management tools and Artificial Intelligence
Technologies: CPUs and Software
• Hardware components of a computer system
• Buses, CPUs, MHz, RAM, Gigs and cache
• Bits and Bytes, storage
• Moore’s Law and price points per MIPs
• Mainframes, RISC computers, Parallel processing
• Open source movement in operating systems
• Enterprise Resource Planning software
• Object oriented programming
......

Words: 1553 - Pages: 7