CIKM 2008 paper list

Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM 2008, Napa Valley, California, USA, October 26-30, 2008.

Categorizing blogger's interests based on short snippets of blog posts.
A matrix-based approach for semi-supervised document co-clustering.
A coarse-grain grid-based subspace clustering method for online multi-dimensional data streams.
Clustering multi-way data via adaptive subspace iteration.
Semi-supervised text categorization by active search.
Multi-scale characterization of social network dynamics in the blogosphere.
Exploiting context to detect sensitive information in call center conversations.
Incorporating topical support documents into a small training set in text categorization.
Effective pattern taxonomy mining in text documents.
Boosting social annotations using propagation.
A spam resistant family of concavo-convex ranks for link analysis.
Trust, authority and popularity in social information retrieval.
SHOPSMART: product recommendations through technical specifications and user reviews.
Utilization of navigational queries for result presentation and caching in search engines.
Measuring user preference changes in digital libraries.
Using a graph-based ontological user profile for personalizing search.
Using the current browsing context to improve search relevance.
The effect of contextualization at different granularity levels in content-oriented XML retrieval.
Incorporating place name extents into geo-IR ranking.
A georeferencing multistage method for locating geographic context in web search.
Efficient estimation of the size of text deep web data source.
Re-considering neighborhood-based collaborative filtering parameters in the context of new data.
Suppressing outliers in pairwise preference ranking.
Workload-based optimization of integration processes.
Polyhedral transformation for indexed rank order correlation queries.
Protecting location privacy against location-dependent attack in mobile services.
Table summarization with the help of domain lattices.
Energy-efficient skyline query processing and maintenance in sensor networks.
Combining concept hierarchies and statistical topic models.
Scalable complex pattern search in sequential data.
In the development of a Spanish MetaMap.
Decomposition of terminology graphs for domain knowledge acquisition.
Fast spatial co-location mining without cliqueness checking.
On quantifying changes in temporally evolving dataset.
Semi-supervised metric learning by maximizing constraint margin.
Detecting significant distinguishing sets among bi-clusters.
Pattern-based semantic class discovery with multi-membership support.
Deriving non-redundant approximate association rules from hierarchical datasets.
GHOST: an effective graph-based framework for name distinction.
Efficient frequent pattern mining over data streams.
Collaborative partitioning with maximum user satisfaction.
Group-based learning: a boosting approach.
Entity-based query reformulation using Wikipedia.
Search-based query suggestion.
Answering general time sensitive queries.
Integrating clustering and multi-document summarization to improve document understanding.
A novel statistical Chinese language model and its application in pinyin-to-character conversion.
Evaluating topic models for information retrieval.
Ranking in folksonomy systems: can context help?
Semi-supervised ranking aggregation.
Estimating retrieval effectiveness using rank distributions.
Mining named entity transliteration equivalents from comparable corpora.
Modeling document features for expert finding.
A survey of pre-retrieval query performance predictors.
A latent variable model for query expansion using the hidden Markov model.
Improve the effectiveness of the opinion retrieval and opinion polarity classification.
Natural language retrieval of grocery products.
Characterization of TPC-H queries for a column-oriented database on a dual-core AMD Athlon processor.
Evaluating partial tree-pattern queries on XML streams.
Estimating the number of answers with guarantees for structured queries in P2P databases.
Query optimization in XML-based information integration.
A light weighted damage tracking quarantine and recovery scheme for mission-critical database systems.
Data degradation: making private data less sensitive over time.
Efficient processing of probabilistic spatio-temporal range queries over moving objects.
Closing the loop in webpage understanding.
Tag-based filtering for personalized bookmark recommendations.
A novel email abstraction scheme for spam detection.
CoreEx: content extraction from online news articles.
Efficient web matrix processing based on dual reordering.
Representative entry selection for profiling blogs.
Estimating real-valued characteristics of criminals from their recorded crimes.
Handling implicit geographic evidence for geographic IR.
Using tag semantic network for keyphrase extraction in blogs.
Summarization of social activity over time: people, actions and concepts in dynamic networks.
Large maximal cliques enumeration in sparse graphs.
A method to predict social annotations.
Coreference resolution using expressive logic models.
Overlapping community structure detection in networks.
An integration strategy for mining product features and opinions.
Metadata extraction and indexing for map search in web documents.
Corpus microsurgery: criteria optimization for medical cross-language IR.
Investigating external corpus and clickthrough statistics for query expansion in the legal domain.
Siphon++: a hidden-web crawler for keyword-based interfaces.
Cross-document cross-lingual coreference retrieval.
Passage relevance models for genomics search.
Using sequence classification for filtering web pages.
Winnowing-based text clustering.
Searching the Wikipedia with contextual information.
Nested region algebra extended with variables for tag-annotated text search.
Online spam-blog detection through blog search.
An extension of PLSA for document clustering.
A note on search based forecasting of ad volume in contextual advertising.
Speed up semantic search in P2P networks.
An approximate string matching approach for handling incorrectly typed URLs.
Yizkor books: a voice for the silent past.
Transaction reordering with application to synchronized scans.
PBFilter: indexing flash-resident data through partitioned summaries.
SQL extension for exploring multiple tables.
View and index selection for query-performance improvement: quality-centered algorithms and heuristics.
ROAD: an efficient framework for location dependent spatial queries on road networks.
Scaling up duplicate detection in graph data.
CE2: towards a large scale hybrid search engine with integrated ranking support.
Privacy-preserving data publishing for horizontally partitioned databases.
Identifying table boundaries in digital documents via sparse line detection.
Academic conference homepage understanding using constrained hierarchical conditional random fields.
Intra-document structural frequency features for semi-supervised domain adaptation.
A system for finding biological entities that satisfy certain conditions from texts.
Cache-aware load balancing for question answering.
Answering questions with authority.
PROQID: partial restarts of queries in distributed databases.
Adaptive distributed indexing for structured peer-to-peer networks.
Valid scope computation for location-dependent spatial query in mobile broadcast environments.
Extremely fast text feature extraction for classification and indexing.
Information shared by many objects.
Scalable community discovery on textual data with relations.
Identification of class specific discourse patterns.
Using structured text for large-scale attribute extraction.
A densitometric approach to web page segmentation.
A generative retrieval model for structured documents.
Structural relevance: a common basis for the evaluation of structured document retrieval.
Trada: tree based ranking function adaptation.
Modeling multi-step relevance propagation for expert finding.
Dr. Searcher and Mr. Browser: a unified hyperlink-click graph.
Multi-aspect expertise matching for review assignment.
An effective algorithm for mining 3-clusters in vertically partitioned data.
EDSC: efficient density-based subspace clustering.
Data weaving: scaling up the state-of-the-art in data clustering.
A consensus based approach to constrained clustering of software requirements.
An effective statistical approach to blog post opinion retrieval.
Blog site search using resource selection.
Key blog distillation: ranking aggregates.
Automatic online news topic ranking using media focus and user attention based on aging theory.
A two-stage text mining model for information filtering.
Search advertising using web relevance feedback.
To swing or not to swing: learning when (not) to advertise.
Unsolved problems in search: (and how we approach them).
The social (open) workspace.
Structure feature selection for graph classification.
Real-time data pre-processing technique for efficient feature extraction in large scale datasets.
Mining influential attributes that capture class and group contrast behaviour.
REDUS: finding reducible subspaces in high dimensional data.
A random walk on the red carpet: rating movies with user reviews and PageRank.
Probabilistic polyadic factorization and its application to personalized recommendation.
SoRec: social recommendation using probabilistic matrix factorization.
Tapping on the potential of Q&A community by recommending answer providers.
Modeling hidden topics on document manifold.
AdaSum: an adaptive model for summarization.
Ranked feature fusion models for ad hoc retrieval.
Joke retrieval: recognizing the same joke told differently.
Records retention in relational database systems.
Dual encryption for query integrity assurance.
Vanity fair: privacy in querylog bundles.
Efficient techniques for document sanitization.
Peer production of structured knowledge: an empirical study of ratings and incentive mechanisms.
Association thesaurus construction methods based on link co-occurrence analysis for Wikipedia.
Finding informative commonalities in concept collections.
Wildcards for lightweight information integration in virtual desktops.
Fast correlation analysis on time series datasets.
Identification of gene function using prediction by partial matching (PPM) language models.
Predicting individual disease risk based on medical history.
Fast mining of complex time-stamped events.
On low dimensional random projections and similarity search.
High-dimensional descriptor indexing for large multimedia databases.
Translation enhancement: a new relevance feedback method for cross-language information retrieval.
Simultaneous multilingual search for translingual information retrieval.
Learning latent semantic relations from clickthrough data for query suggestion.
Beyond the session timeout: automatic hierarchical segmentation of search topics in query logs.
Matching task profiles and user needs in personalized web search.
Can phrase indexing help to process non-phrase queries?
Modeling LSH for performance tuning.
Supporting sub-document updates and queries in an inverted index.
A new method for indexing genomes using on-disk suffix trees.
Exploiting pipeline interruptions for efficient memory allocation.
Proactive learning: cost-sensitive active learning with multiple imperfect oracles.
A framework for estimating complex probability density structures in data streams.
The query-flow graph: model and applications.
Clustered subset selection and its applications on IT service metrics.
How evaluator domain expertise affects search result relevance judgments.
Comparing metrics across TREC and NTCIR: the robustness to system bias.
Statistical power in retrieval experimentation.
Retrievability: an evaluation measure for higher order information access tasks.
A heuristic approach for checking containment of generalized tree-pattern queries.
Pruning nested XQuery queries.
Some rewrite optimizations of DB2 XQuery navigation.
Rewriting of visibly pushdown languages for XML data integration.
Markov logic: a unifying language for knowledge and information management.
Learning to link with Wikipedia.
Discovering leaders from community actions.
Non-local evidence for expert finding.
Mining term association patterns from search logs for effective query reformulation.
Query suggestion using hitting time.
Active relevance feedback for difficult queries.
Understanding the relationship between searchers' queries and information goals.
Improved query difficulty prediction for the web.
Relating dependent indexes using Dempster-Shafer theory.
Revisiting the relationship between document length and relevance.
TinyLex: static n-gram index pruning with perfect recall.
Generalized inverse document frequency.
Linear time membership in a class of regular expressions with interleaving and counting.
Real-time new event detection for video streams.
SNIF TOOL: sniffing for patterns in continuous streams.
Anomaly-free incremental output in stream processing.
Inferring semantic query relations from collective user behavior.
Predicting web spam with HTTP session information.
Spam characterization and detection in peer-to-peer file-sharing systems.
An algorithm to determine peer-reviewers.
Characterizing and predicting community members from evolutionary and heterogeneous networks.
On effective presentation of graph patterns: a structural representative approach.
Link privacy in social networks.
Local approximation of PageRank and reverse PageRank.
Learning a two-stage SVM/CRF sequence classifier.
BNS feature scaling: an improved representation over tf-idf for SVM text classification.
Kernel methods, syntax and semantics for relational text categorization.
Exploiting temporal contexts in text classification.
Mining social networks using heat diffusion processes for marketing candidates selection.
Social tags: meaning and suggestions.
Comparing citation contexts for information retrieval.
Can all tags be used for search?
A novel optimization approach to efficiently process aggregate similarity queries in metric access methods.
Modeling and exploiting query interactions in database systems.
A step towards incremental maintenance of the composed schema mapping.
Content-based filtering for efficient online materialized view maintenance.
An empirical study of required dimensionality for large-scale latent semantic indexing applications.
MedSearch: a specialized search engine for medical information retrieval.
Semi-automated logging of contact center telephone calls.
Web-scale named entity recognition.
Classifying networked entities with modularity kernels.
Transfer learning from multiple source domains via consensus regularization.
A sparse Gaussian processes classification framework for fast tag suggestions.
Error-driven generalist+experts (EDGE): a multi-stage ensemble framework for text categorization.
Are click-through data adequate for learning web search rankings?
Achieving both high precision and high recall in near-duplicate detection.
Efficient and effective link analysis with precomputed SALSA maps.
How does clickthrough data reflect retrieval quality?
Integrating web query results: holistic schema matching.
A language for manipulating clustered web documents results.
Minimum-effort driven dynamic faceted search in structured databases.
Dynamic faceted search for discovery-driven analysis.
Humane data mining.