
SIGKDD (KDD) 2006 Paper List

Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, August 20-23, 2006.

Is there a grand challenge or X-prize for data mining?
Beyond classification and ranking: constrained optimization of the ROI.
Camouflaged fraud detection in domains with complex relationships.
YALE: rapid prototyping for complex data mining tasks.
Maximum profit mining and its application in software development.
Discovering significant OPSM subspace clusters in massive gene expression data.
A component-based framework for knowledge discovery in bioinformatics.
Mining citizen science data to predict prevalence of wild bird species.
Identifying "best bet" web search results by mining past user behavior.
Opportunity map: identifying causes of failure - a deployed data mining system.
Understandable models of music collections based on exhaustive feature generation with temporal statistics.
GPLAG: detection of software plagiarism by program dependence graph analysis.
Mining for proposal reviewers: lessons learned at the National Science Foundation.
Pragmatic text mining: minimizing human effort to quantify many issues in call logs.
Onboard classifiers for science event detection on a remote sensing spacecraft.
Computer aided detection via asymmetric cascade of sparse hyperplane classifiers.
Data mining challenges in the automotive domain.
Information extraction, data mining and joint inference.
Capital One's statistical problems: our top ten list.
Introducing perpetual analytics.
BLOSOM: a framework for mining arbitrary boolean expressions.
Linear prediction models with graph regularization for web-page categorization.
Identifying bridging rules between conceptual clusters.
Attack detection in time series for recommender systems.
Mining progressive confident rules.
Coherent closed quasi-clique discovery from large dense graph databases.
Integration of semantic-based bipartite graph representation and mutual refinement strategy for biomedical literature clustering.
Utility-based anonymization using local recoding.
K-means clustering versus validation measures: a data distribution perspective.
Discovering interesting patterns through user's interactive feedback.
Outlier detection by sampling with accuracy guarantees.
Incremental approximate matrix factorization for speeding up support vector machines.
(alpha, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing.
Semi-supervised time series classification.
A large-scale analysis of query logs for assessing personalization opportunities.
Suppressing model overfitting in mining concept-drifting data streams.
Summarizing itemset patterns using probabilistic models.
Efficient kernel feature extraction for massive data sets.
Mining long-term search history to improve search accuracy.
Combining linguistic and statistical analysis to extract relations from web documents.
MONIC: modeling and monitoring cluster transitions.
Naïve filterbots for robust cold-start recommendations.
Automatic mining of fruit fly embryo images.
Mining for misconfigured machines in grid systems.
Statistical entity-topic models.
Clustering based large margin classification: a scalable approach using SOCP formulation.
Algorithms for time series knowledge mining.
Efficient multidimensional data representations based on multiple correspondence analysis.
A new multi-view regression approach with an application to customer wallet estimation.
A mixture model for contextual text mining.
Visual data mining using principled projection algorithms and information visualization techniques.
Clustering pair-wise dissimilarity data into partially ordered sets.
Sampling from large graphs.
Bias and controversy: beyond the statistical deviation.
Cryptographically private support vector machines.
Structure and evolution of online social networks.
Algorithms for storytelling.
Reducing the human overhead in text categorization.
CFI-Stream: mining closed frequent itemsets in data streams.
Polynomial association rules with applications to logistic regression.
Dynamic, real-time forecasting of online auctions via functional models.
Recommendation method for extending subscription periods.
Mining relational data through correlation-based multiple view validation.
Algorithms for discovering bucket orders from data.
Evolutionary clustering.
Single-pass online learning: performance, voting schemes and online feature selection.
Classification features for attack detection in collaborative recommender systems.
Model compression.
Query-time entity resolution.
A framework for analysis of dynamic social networks.
CCCS: a top-down associative classifier for imbalanced class distribution.
On privacy preservation against adversarial data mining.
Outlier detection by active learning.
Simultaneous record detection and attribute labeling in web data extraction.
Event detection from evolution of click-through data.
Extracting key-substring-group features for text classification.
Supervised probabilistic principal component analysis.
Regularized discriminant analysis for high dimensional, low sample size data.
Extracting redundancy-aware top-k patterns.
Discovering significant rules.
Topics over time: a non-Markov continuous-time model of topical trends.
Anonymizing sequential releases.
Center-piece subgraphs: problem definition and fast solutions.
Mining distance-based outliers from large databases in any metric space.
Acclimatizing taxonomic semantics for hierarchical content classification from semantics to data-driven taxonomy.
Beyond streams and graphs: dynamic tensor analysis.
Learning sparse metrics via linear programming.
Using structure indices for efficient approximation of network properties.
Aggregating time partitions.
Generating semantic annotations for frequent patterns with context analysis.
Tensor-CUR decompositions for tensor-based data.
Unsupervised learning on k-partite graphs.
Fast mining of high dimensional expressive contrast patterns using zero-suppressed binary decision diagrams.
Rule interestingness analysis using OLAP operations.
Very sparse random projections.
Workload-aware anonymization.
New EM derived from Kullback-Leibler divergence.
Hierarchical topic segmentation of websites.
Measuring and extracting proximity in networks.
Maximally informative k-itemsets and their efficient discovery.
Mining quantitative correlated patterns using an information-theoretic approach.
Training linear SVMs in linear time.
Adaptive event detection with time-varying Poisson processes.
Frequent subgraph mining in outerplanar graphs.
Learning the unified kernel machines for classification.
A new efficient probabilistic model for mining labeled ordered trees.
Assessing data mining results via swap randomization.
Quantifying trends accurately despite classifier error and class imbalance.
Reverse testing: an efficient framework to select amongst classifiers under sample selection bias.
A general framework for accurate and fast regression by data summarization in random decision trees.
Orthogonal nonnegative matrix tri-factorizations for clustering.
Estimating the global PageRank of web communities.
NeMoFinder: dissecting genome-wide protein-protein interactions with meso-scale network motifs.
Mining rank-correlated sets of numerical attributes.
Out-of-core frequent pattern mining on a commodity PC.
Efficient anonymity-preserving data collection.
Robust information-theoretic clustering.
Detecting outliers using transduction and statistical testing.
Group formation in large social networks: membership, growth, and evolution.
Global distance-based segmentation of trajectories.
Spatial scan statistics: approximations and performance study.
Learning to rank networked entities.
Deriving quantitative models for correlation clusters.
Next frontier.
New cached-sufficient statistics algorithms for quickly answering statistical questions.
Self-organizing wireless sensor networks in action.