Theory


Fundamentals

Unsupervised Learning by Probabilistic Latent Semantic Analysis (2001)
Expectation-Propagation for the Generative Aspect Model (2002)
Latent Dirichlet Allocation (2003)
On an Equivalence between PLSI and LDA (2003)
Finding Scientific Topics (2004)
On Smoothing and Inference for Topic Models (2009)
Rethinking LDA: Why Priors Matter (2009)
Accounting for Burstiness in Topic Models (2009)
Posterior Contraction of the Population Polytope in Finite Admixture Models (2012)
Understanding the Limiting Factors of Topic Modeling via Posterior Contraction Analysis (2014)

Inference

WarpLDA: a Simple and Efficient O(1) Algorithm for Latent Dirichlet Allocation (2015)
SAME but Different: Fast and High-Quality Gibbs Parameter Estimation (2014)
LightLDA: Big Topic Models on Modest Compute Clusters (2014)
Scalable Inference for Logistic-Normal Topic Models (2013)
Variational Inference in Nonconjugate Models (2013)
Rethinking Collapsed Variational Bayes Inference for LDA (2012)
Practical Collapsed Variational Bayes Inference for Hierarchical Dirichlet Process (2012)
Deterministic Single-Pass Algorithm for LDA (2010)
Quantum Annealing for Variational Bayes Inference (2009)
Gibbs Sampling for Logistic Normal Topic Models with Graph-Based Priors (2008)
Collapsed Variational Inference for HDP (2007) (presenting an improved version of CVB for LDA)
A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation (2006)

Online learning

A Filtering Approach to Stochastic Variational Inference (2014)
Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation (2013)
An Adaptive Learning Rate for Stochastic Variational Inference (2013)
Stochastic variational inference (2013)
Sparse stochastic inference for latent Dirichlet allocation (2012)
A Practical Algorithm for Topic Modeling with Provable Guarantees (2012)
Online Variational Inference for the Hierarchical Dirichlet Process (2011)
Online Learning for Latent Dirichlet Allocation (2010)
Online Inference of Topics with Latent Dirichlet Allocation (2009)
On-line LDA: Adaptive Topic Models for Mining Text Streams with Applications to Topic Detection and Tracking (2008)
Topic Models over Text Streams: a Study of Batch and Online Unsupervised Learning (2006)

Parallelization

Mr. LDA: A Flexible Large Scale Topic Modeling Package using Variational Inference in MapReduce (2012)
PLDA+: Parallel Latent Dirichlet Allocation with Data Placement and Pipeline Processing (2011)
An Architecture for Parallel Topic Models (2010)
Distributed Algorithms for Topic Models (2009)
Parallel Inference for Latent Dirichlet Allocation on Graphics Processing Units (2009)
PLDA: Parallel Latent Dirichlet Allocation for Large-scale Applications (2009)
Efficient Methods for Topic Model Inference on Streaming Document Collections (2009)
Fast Collapsed Gibbs Sampling for Latent Dirichlet Allocation (2008)
Asynchronous Distributed Learning of Topic Models (2008)
Distributed Inference for Latent Dirichlet Allocation (2007)
Parallelized Variational EM for Latent Dirichlet Allocation: An Experimental Evaluation of Speed and Scalability (2007)

Variants

Admixture of Poisson MRFs (2014)
On Modelling Non-linear Topical Dependencies (2014)
Latent Gaussian Models for Topic Modeling (2014)
Sparse online topic models (2013)
Online Latent Dirichlet Allocation with Infinite Vocabulary (2013)
Kernel Topic Models (2012)
Factorial LDA: Sparse Multi-Dimensional Text Models (2012)
Sparse Additive Generative Models of Text (2011)
Gaussian Process Topic Models (2010)
Unsupervised organization of image collections: taxonomies and beyond (2010)
Topic Models with Power-Law Using Pitman-Yor Process (2010)
Term Weighting Schemes for Latent Dirichlet Allocation (2010)
Topic Models Conditioned on Relations (2010)
Discriminative Topic Modeling based on Manifold Learning (2010)
Interactive Topic Modeling (2011)
A Two-dimensional Topic-aspect Model for Discovering Multi-faceted Topics (2010)
Conditional Topic Random Fields (2010)
Incorporating Domain Knowledge into Topic Modeling via Dirichlet Forest Priors (2009)
Markov Random Topic Fields (2009)
A Bayesian Hierarchical Topic Model for Political Texts: Measuring Expressed Agendas in Senate Press Releases (2009)
Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression (2008)
Mixtures of Hierarchical Topics with Pachinko Allocation (2007)
Generalized Component Analysis for Text with Heterogeneous Attributes (2007)
Supervised Topic Models (2007)
A Correlated Topic Model of Science (2007)
Nonparametric Bayes Pachinko Allocation (2007)
Hierarchical Dirichlet Processes (2005)
Hierarchical Topic Models and the Nested Chinese Restaurant Process (2003)

Supervised learning

Topic models for taxonomies (2012)
Hierarchically Supervised Latent Dirichlet Allocation (2011)
Partially Labeled Topic Models for Interpretable Text Mining (2011)
DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification (2008)
Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora (2009)
MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification (2009)
Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression (2008)
Conditionally Trained Latent Dirichlet Allocation for Text Modeling and Categorization (2008)
Supervised topic models (2007)
Bayesian Document Generative Model with Explicit Multiple Topics (2007)

Evaluation

Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality (2014)
Sometimes Average is Best: The Importance of Averaging for Prediction using MCMC Inference in Topic Modeling (2014)
A recursive estimate for the predictive likelihood in a topic model (2013)
Evaluating Topic Coherence Using Distributional Semantics (2013)
Bayesian Checking for Topic Models (2011)
Automatic Evaluation of Topic Coherence (2010)
External Evaluation of Topic Models (2009)
Reading Tea Leaves: How Humans Interpret Topic Models (2009)
Evaluation Methods for Topic Models (2009)



Applications


Visualization

Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment (2013)
Termite: Visualization Techniques for Assessing Textual Topic Models (2012)
Visualizing Topic Models (2012)
TopicViz: Semantic Navigation of Document Collections (2011)
The Topic Browser: An Interactive Tool for Browsing Topic Models (2010)
Interactive, Topic-based Visual Text Summarization and Analysis (2009)
Probabilistic Latent Semantic Visualization: Topic Model for Visualizing Documents (2008)

Social networks

Probabilistic Latent Document Network Embedding (2014)
Online Topic Model for Twitter Considering Dynamics of User Interests and Topic Trends (2014)
Self-disclosure topic model for classifying and analyzing Twitter (2014)
The dual-sparse topic model: mining focused topics and focused terms in short text (2014)
A Temporal Context-Aware Model for User Behavior Modeling in Social Media Systems (2014)
Latent community discovery through enterprise user search query modeling (2014)
Discriminative Relational Topic Models (2013)
Incorporating popularity in topic models for social network analysis (2013)
Modeling Overlapping Communities with Node Popularities (2013)
Social-network analysis using topic models (2012)
Semantic Social Network Analysis with Text Corpora (2012)
LeadLag LDA: Estimating Topic Specific Leads and Lags of Information Outlets (2011)
TopicFlow model: Unsupervised learning of topic specific influences of hyperlinked documents (2011)
Comparing Twitter and Traditional Media Using Topic Models (2011)
Empirical Study of Topic Modeling in Twitter (2010)
Characterizing Microblogs with Topic Models (2010)
TwitterRank: Finding Topic-sensitive Influential Twitterers (2010)
Mining Topic-Level Influence in Heterogeneous Networks (2010)
Utilizing Context in Generative Bayesian Models for Linked Corpus (2010)
Connections between the Lines: Augmenting Social Networks with Text (2009)
Relational Topic Models for Document Networks (2009)
---> Hierarchical Relational Models for Document Networks (2010)
Topic-Link LDA: Joint Models of Topic and Author Community (2009)
iTopicModel: Information Network-Integrated Topic Modeling (2009)
Modeling Hidden Topics on Document Manifold (2008) (an extension of PLSI)
Topic Modeling with Network Regularization (2008)
Latent Topic Models for Hypertext (2008)
Link-PLSA-LDA: A New Unsupervised Model for Topics and Influence of Blogs (2008)
Arnetminer: Extraction and Mining of Academic Social Networks (2008)
Joint Latent Topic Models for Text and Citations (2008)
Community Evolution in Dynamic Multi-mode Networks (2008)
Social Topic Models for Community Extraction (2008)
Topic and Role Discovery in Social Networks with Experiments on Enron and Academic Email (2007)
An LDA-based Community Structure Discovery Approach for Large-scale Social Networks (2007)
Probabilistic Community Discovery Using Hierarchical Latent Gaussian Mixture Model (2007)
Joint Group and Topic Discovery from Relations and Text (2007)
Probabilistic Models for Discovering E-communities (2006)
Group and Topic Discovery from Relations and Text (2005)

Digital Humanities

Journal of Digital Humanities Vol. 2, No. 1 Winter 2012 (2012)
Data Science for Politics, Policy, and Government (2015)

Sentiment analysis and opinion mining

That’s So Annoying!!!: A Lexical and Frame-Semantic Embedding Based Data Augmentation Approach to Automatic Categorization of Annoying Behaviors using #petpeeve Tweets (2015)
The Inverse Regression Topic Model (2014)
A Topic Model for Building Fine-grained Domain-specific Emotion Lexicon (2014)
A Sentiment-aligned Topic Model for Product Aspect Rating Prediction (2014)
The Structural Topic Model and Applied Social Science (2013)
The FLDA Model for Aspect-based Opinion Mining: Addressing the Cold Start Problem (2013)
Dynamic Joint Sentiment-Topic Model (2012)
User-sentiment topic model: refining user's topics with sentiment information (2012)
Aspect and Sentiment Unification Model for Online Review Analysis (2011)
Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation (2010)
Latent Aspect Rating Analysis on Review Text Data: A Rating Regression Approach (2010)
An Unsupervised Aspect-sentiment Model for Online Reviews (2010)
Jointly Modeling Aspects and Opinions with a MaxEnt-LDA Hybrid (2010)
Rated Aspect Summarization of Short Comments (2009)
Learning Document-level Semantic Properties from Free-text Annotations (2009)
Joint Sentiment/Topic Model for Sentiment Analysis (2009)
Mining Multi-faceted Overviews of Arbitrary Topics in a Text Collection (2008)
Modeling Online Reviews with Multi-grain Topic Models (2008)
A Joint Model of Text and Aspect Ratings for Sentiment Summarization (2008)
Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs (2007)
Opinion Integration through Semi-supervised Topic Modeling (2008)

Temporal and spatial data analysis

An event extraction model based on timeline and user analysis in Latent Dirichlet allocation (2014)
Finding Bursty Topics from Microblogs (2012)
A Topic Model for Melodic Sequences (2012)
Studying software evolution using topic models (2012)
Geographical Topic Discovery and Comparison (2011)
Random Field Topic Model for Semantic Region Analysis in Crowded Scenes from Tracklets (2011)
Online Multiscale Dynamic Topic Models (2010)
A Latent Variable Model for Geographic Lexical Variation (2010)
Evolutionary Hierarchical Dirichlet Processes for Multiple Correlated Time-varying Corpora (2010)
Mining Common Topics from Multiple Asynchronous Text Streams (2009)
Topic Evolution in a Stream of Documents (2009)
Studying the History of Ideas Using Topic Models (2008)
The Dynamic Hierarchical Dirichlet Process (2008)
Continuous Time Dynamic Topic Model (2008)
Multiscale Topic Tomography (2007)
Mining Correlated Bursty Topic Patterns from Coordinated Text Streams (2007)
Dynamic Mixture Models for Multiple Time Series (2007)
Spatial Latent Dirichlet Allocation (2007)
A Probabilistic Approach to Spatiotemporal Theme Pattern Mining on Weblogs (2006)
Dynamic Topic Models (2006)
Topics over Time: a Non-Markov Continuous-time Model of Topical Trends (2006)
Discovering Evolutionary Theme Patterns from Text: an Exploration of Temporal Text Mining (2005)

Scientific publication mining

Context Sensitive Topic Models for Author Influence in Document Networks (2012)
A Language-based Approach to Measuring Scholarly Impact (2010)
Latent Interest-Topic Model: Finding the Causal Relationships behind Dyadic Data (2010)
Block-LDA: Jointly Modeling Entity-annotated Text and Entity-entity Links (2010)
Context-aware Citation Recommendation (2010)
Learning Author-Topic Models from Text Corpora (2010)
Detecting Topic Evolution in Scientific Literature: How Can Citations Help? (2009)
Topic and Trend Detection in Text Collections Using Latent Dirichlet Allocation (2009)
A Topic Modeling Approach and its Integration into the Random Walk Framework for Academic Search (2008)
Joint Latent Topic Models for Text and Citations (2008)
Unsupervised Prediction of Citation Influences (2007)
Co-ranking Authors and Documents in a Heterogeneous Network (2007)
Expertise Modeling for Matching Papers with Reviewers (2007)
Topic Evolution and Social Interactions: How Authors Effect Research (2006)
Modeling Individual Differences using Dirichlet Processes (2006)
Multi-Aspect Expertise Matching for Review Assignment (an extension of PLSI)
Group and Topic Discovery from Relations and Their Attributes
Mining a Digital Library for Influential Authors (2007)
Bibliometric Impact Measures Leveraging Topic Analysis (2006)
Statistical Entity-Topic Models (2006)
The Author-Recipient-Topic Model for Topic and Role Discovery in Social Networks (2005)
Learning Author Topic Models from Text Corpora (2005)
Mixed-Membership Models of Scientific Publications (2004)
Probabilistic Author-Topic Models for Information Discovery (2004)
The Author-Topic Model for Authors and Documents (2004)

Information retrieval

Collaborative Personalized Twitter Search with Topic-Language Models (2014)
An unsupervised topic segmentation model incorporating word order (2013)
Regularized Latent Semantic Indexing: A New Approach to Large-Scale Topic Modeling (2013)
Semantic Hashing using Tags and Topic Modeling (2013)
Building User Profiles from Topic Models for Personalised Search (2013)
An LDA-smoothed relevance model for document expansion: a case study for spoken document retrieval (2013)
The Generalized Dirichlet Distribution in Enhanced Topic Detection (2012)
Clickthrough-Based Latent Semantic Models for Web Search (2011)
Regularized Latent Semantic Indexing (2011)
Bridging Topic Modeling and Personalized Search (2010)
A Comparative Study of Utilizing Topic Models for Information Retrieval (2009)
Exploring Topic-based Language Models for Effective Web Information Retrieval (2008)
Evaluating Topic Models for Information Retrieval (2008)
Exploring Social Annotations for Information Retrieval (2008)
LDA-based Document Models for Ad-hoc Retrieval (2006)
Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model (2006)

Information extraction, Domain adaptation

Polylingual Tree-Based Topic Models for Translation Domain Adaptation (2014)
Topic Modeling using Topics from Many Domains, Lifelong Learning and Big Data (2014)
Leveraging multi-domain prior knowledge in topic models (2013)
Transfer Topic Modeling with Ease and Scalability (2012)
Topic model for analyzing purchase data with price information (2012)
Optimizing Semantic Coherence in Topic Models (2011)
Learning to Adapt Web Information Extraction Knowledge and Discovering New Attributes via a Bayesian Approach (2010)
Employing Topic Models for Pattern-based Semantic Class Discovery (2009)
Semi-supervised Extraction of Entity Aspects Using Topic Models (2009)
Modeling Documents by Combining Semantic Concepts with Unsupervised Statistical Learning (2008)
An Unsupervised Framework for Extracting and Normalizing Product Attributes from Multiple Web Sites (2008)
A Probabilistic Approach for Adapting Information Extraction Wrappers and Discovering New Attributes (2004)

Recommendation

Ranking Linked-Entities in a Sentiment Graph (2014)
LA-LDA: A Limited Attention Topic Model for Social Recommendation (2013)
Community-Based User Recommendation in Uni-Directional Social Networks (2013)
Feature LDA: a Supervised Topic Model for Automatic Detection of Web API Documentations from the Web (2012)
Using latent topics to enhance search and recommendation in Enterprise Social Software (2012)
Collaborative Topic Modeling for Recommending Scientific Articles (2011)
---> Collaborative Topic Regression with Social Matrix Factorization for Recommendation Systems (2012)
Auralist: Introducing Serendipity into Music Recommendation (2011)
Topic Modeling for Personalized Recommendation of Volatile Items (2010)
Linear Submodular Bandits and their Application to Diversified Retrieval (2011)
Turning Down the Noise in the Blogosphere (2009)

Annotations, Tagging, Labeling

Tagging Your Tweets: A Probabilistic Modeling of Hashtag Annotation in Twitter (2014)
Automatic Labelling of Topic Models Learned from Twitter by Summarisation (2014)
Improving LDA topic models for microblogs via tweet pooling and automatic labeling (2013)
A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching (2013)
Learning Topical Translation Model for Microblog Hashtag Suggestion (2013)
Automatic Labeling of Multinomial Topic Models (2011)
Context Modeling for Ranking and Tagging Bursty Features in Text Streams (2010)
The Topic-Perspective Model for Social Tagging Systems (2010)
A Probabilistic Topic-Connection Model for Automatic Image Annotation (2010)
Ranking Social Bookmarks Using Topic Models (2010)
Clustering the Tagged Web (2009)
Latent Dirichlet Allocation for Tag Recommendation (2009)
Tag-LDA for Scalable Real-time Tag Recommendation (2009)
Learning Document-level Semantic Properties from Free-Text Annotations (2008)
Generating Summary Keywords for Emails Using Topics (2008)
Using Multiple Segmentations to Discover Objects and their Extent in Image Collections (2006)
Modeling Annotated Data (2003)

Summarization

Query-focused Multi-Document Summarization: Combining a Topic Model with Graph-based Semi-supervised Learning (2014)
Generating Aspect-oriented Multi-Document Summarization with Event-aspect model (2011)
Topical Keyphrase Extraction from Twitter (2011)
Generating Templates of Entity Summaries with an Entity-Aspect Model and Pattern Mining (2010)
A Hybrid Hierarchical Model for Multi-Document Summarization (2010)
Visually Summarizing the Evolution of Documents under a Social Tag (2010)
Topic-based Multi-document Summarization with Probabilistic Latent Semantic Analysis (2009)
Multi-topic based Query-oriented Summarization (2009)
Multi-Document Summarization using Sentence-based Topic Models (2009)
Generating Impact-Based Summaries for Scientific Literature (2008)
Latent Dirichlet Allocation and Singular Value Decomposition Based Multi-document Summarization (2008)
Bayesian Query-Focused Summarization (2006)
Topic Segmentation with an Aspect Hidden Markov Model (2001)

NLP

A Context-Aware Topic Model for Statistical Machine Translation (2015)
Unsupervised learning of rhetorical structure with un-topic models (2014)
Classifying Idiomatic and Literal Expressions Using Topic Models and Intensity of Emotions (2014)
Learning Polylingual Topic Models from Code-Switched Social Media Documents (2014)
Probabilistic Distributional Semantics with Latent Variable Models (2014)
Authorship attribution based on a probabilistic topic model (2013)
An N-Gram Topic Model for Time-Stamped Documents (2013)
An Entity-Topic Model for Entity Linking (2012)
Word Sense Induction for Novel Sense Detection (2012)
Probabilistic models of similarity in syntactic context (2011)
Authorship Attribution with Latent Dirichlet Allocation (2011)
Structured Relation Discovery using Generative Models (2011)
A Framework for Incorporating General Domain Knowledge into Latent Dirichlet Allocation using First-Order Logic (2011)
Translingual Document Representations from Discriminative Projections (2010)
Extracting Multilingual Topics from Unaligned Comparable Corpora (2010)
Topic Models for Word Sense Disambiguation and Token-based Idiom Detection (2010)
A Latent Dirichlet Allocation method for Selectional Preferences (2010)
Cross-Lingual Latent Topic Extraction (2010)
Exploiting Conversation Structure in Unsupervised Topic Segmentation for Emails (2010)
Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases (2010)
Word Features for Latent Dirichlet Allocation (2010)
Topic models for meaning similarity in context (2010)
Multilingual Topic Models for Unaligned Text (2009)
Content Modeling Using Latent Permutations (2009)
Mining Multilingual Topics from Wikipedia (2009)
Syntactic Topic Models (2009)
Named Entity Recognition in Query (2009)
Markov topic models (2009)
Modeling Syntactic Structures of Topics with a Nested HMM-LDA (2009)
Polylingual Topic Models (2009)
Word Topic Models for Spoken Document Retrieval and Transcription (2009)
A Topic Model for Word Sense Disambiguation (2007)
Improving Word Sense Disambiguation Using Topic Features (2007)
Topical n-grams: Phrase and Topic Discovery, with an Application to Information Retrieval (2007)
A Bayesian LDA-based Model for Semi-supervised Part-of-speech Tagging (2007)
Topics in Semantic Representation (2007)
Topic Modeling: Beyond Bag-of-words (2006)
Integrating Topics and Syntax (2005)

DB

Topic Cube: Topic Modeling for OLAP on Multidimensional Text Databases (2009)

etc

Topic Modeling Workshop at NIPS 2013