Cloudera Fast Forward Labs Research Previews

Fast Forward Labs research now available without a subscription

Moving forward, all new reports will be publicly available and free to download. In addition, we will be providing access to updated versions of older reports over time, so check back often to explore available free research.

Free research reports

Explore our latest research reports and prototypes, freely accessible to all.

Text Style Transfer

The NLP task of text style transfer (TST) aims to automatically control the style attributes of a piece of text while preserving the content, which is an important consideration for making NLP more user-centric. In this report, we explore text style transfer through an applied use case — neutralizing subjectivity bias in free text. Along the way, we describe our sequence-to-sequence modeling approach leveraging HuggingFace Transformers, and present a set of custom, reference-free evaluation metrics for quantifying model performance. Finally, we conclude with a discussion of ethics centered around our prototype: Exploring Intelligent Writing Assistance.

Read the report
Explore the code

Inferring Concept Drift Without Labeled Data

Concept drift occurs when the statistical properties of a target domain change over time, causing model performance to degrade. Drift detection is generally achieved by monitoring a performance metric of interest and triggering a retraining pipeline when that metric falls below some designated threshold. However, this approach assumes ample labeled data is available at prediction time, an unrealistic constraint for many production systems. In this report, we explore various approaches for dealing with concept drift when labeled data is not readily accessible.

Read the report
Explore the code

Exploring Multi-Objective Hyperparameter Optimization

We develop machine learning models against the “usual suspect” metrics like predictive accuracy, recall, and precision. However, these metrics are rarely all we care about. Production models must also satisfy physical requirements, such as latency or memory footprint, as well as fairness constraints. Hyperparameter optimization becomes even more challenging when we have multiple metrics to optimize. Our latest research examines this “multi-objective” hyperparameter optimization scenario in detail.

Read the report
Explore the code

Deep Learning for Automatic Offline Signature Verification

Handwritten signature verification aims to automatically discriminate between genuine and forged signatures, and is a particularly important challenge due to the ubiquity of handwritten signatures as a form of identification in legal, financial, and administrative domains. This research cycle explored the use of deep metric learning approaches (specifically Siamese networks) combined with novel feature extraction methods to improve upon traditional techniques.

Read the report
Explore the code

Session-Based Recommender Systems

Recommendation systems have become a cornerstone of modern life, spanning sectors that include online retail, music and video streaming, and even content publishing. These systems help us navigate the sheer volume of content on the internet, allowing us to discover what’s interesting or important to us. A key trend over the past few years has been session-based recommendation algorithms that provide recommendations solely based on a user’s interactions in an ongoing session, and which do not require the existence of user profiles or their entire historical preferences.

Read the report
Explore the code

Few-Shot Text Classification

Text classification can be used for sentiment analysis, topic assignment, document identification, article recommendation, and more. While dozens of techniques now exist for this fundamental task, many of them require massive amounts of labeled data in order to be useful. Collecting annotations for your use case is typically one of the most costly parts of any machine learning application. In this report, we explore how latent text embeddings can be used with few (or even zero) training examples and provide insights into best practices for implementing this method.

Read the report
Explore the code

Structural Time Series

Time series data is ubiquitous. This report examines generalized additive models, which give us a simple, flexible, and interpretable means for modeling time series by decomposing them into structural components. We look at the benefits and trade-offs of taking a curve-fitting approach to time series, and demonstrate its use via Facebook’s Prophet library on a demand forecasting problem.

Read the report
Explore the code

Meta-Learning

In contrast to how humans learn, deep learning algorithms need vast amounts of data and compute and may yet struggle to generalize. Humans are successful in adapting quickly because they leverage their knowledge acquired from prior experience when faced with new problems. In this report, we explain how meta-learning can leverage previous knowledge acquired from data to solve novel tasks quickly and more efficiently at test time.

Read the report
Explore the code

Automated Question Answering

Automated question answering is a user-friendly way to extract information from data using natural language. Thanks to recent advances in natural language processing, question answering capabilities from unstructured text data have grown rapidly. This blog series offers a walk-through detailing the technical and practical aspects of building an end-to-end question answering system.

Read the interactive blog series

Causality for Machine Learning

The intersection of causal inference and machine learning is a rapidly expanding area of research that's already yielding capabilities for building more robust, reliable, and fair machine learning systems. This report offers an introduction to causal reasoning, including causal graphs and invariant prediction, and shows how to apply causal inference tools together with classic machine learning techniques in multiple use cases.

Read the report
Explore the prototype

Interpretability: 2020 Edition

Interpretability, or the ability to explain why and how a system makes a decision, can help us improve models, satisfy regulations, and build better products. Black-box techniques like deep learning have delivered breakthrough capabilities at the cost of interpretability. In this report, recently updated to include techniques like SHAP, we show how to make models interpretable without sacrificing their capabilities or accuracy.

Read the report

Deep Learning for Anomaly Detection

From fraud detection to flagging abnormalities in imaging data, there are countless applications for automatic identification of abnormal data. This process can be challenging, especially when working with large, complex data. This report explores deep learning approaches (sequence models, VAEs, GANs) for anomaly detection, when to use them, performance benchmarks, and product possibilities.

Read the report
Explore the prototype

Transfer Learning for Natural Language Processing

Natural language processing (NLP) technologies using deep learning can translate language, answer questions, and generate human-like text. But these deep learning techniques require large, costly labeled datasets, expensive infrastructure, and scarce expertise. Transfer learning lifts these constraints by reusing and adapting a model’s understanding of language. Transfer learning is a good fit for any NLP application. In this report, we show how to use transfer learning to build high-performance NLP systems with minimal resources.

Read the report

Learning with Limited Labeled Data

Being able to learn with limited labeled data relaxes the stringent labeled data requirement for supervised machine learning. This report focuses on active learning, a technique that relies on collaboration between machines and humans to label smartly. Active learning reduces the number of labeled examples required to train a model, saving time and money while obtaining comparable performance to models trained with much more data. With active learning, enterprises can leverage their large pool of unlabeled data to open up new product possibilities.

Read the report
Explore the prototype

Federated learning

Federated learning makes it possible to build machine learning systems without direct access to training data. The data remains in its original location, which helps to ensure privacy and reduces communication costs. Federated learning is a great fit for smartphones and edge hardware, healthcare and other privacy-sensitive use cases, and industrial applications such as predictive maintenance.

Read the report
Explore the prototype

Semantic Recommendations

The internet has given us an avalanche of options for what to read, watch, and buy. Because of this, recommendation algorithms, which find items that will interest a particular person, are more important than ever. In this report, we explore recommendation systems that make use of the semantic content of items and users to deliver richer recommendations across multiple industries.

Read the report

Summarization

This report explores methods for extractive summarization, a capability that allows one to automatically summarize documents. This technique has a wealth of applications, such as distilling thousands of product reviews, extracting the most important content from long news articles, and automatically clustering customer bios into personas.

Read the report

Deep Learning for Image Analysis - 2019 Edition

Convolutional neural networks (CNNs or ConvNets) excel at learning meaningful representations of features and concepts within images, making CNNs valuable for solving problems in multiple domains, from medical imaging to manufacturing. In this report, we show how to select the right deep learning models for image analysis tasks and techniques for debugging deep learning models.

Read the report
Explore the prototype

Deep Learning: Image Analysis

This report explores the history and current state of deep learning, explains how to apply it, and predicts future developments.

Read the report

Probabilistic Methods for Realtime Streams

Since the days of analog computers built on cams and gears, we’ve been engineering systems around the flow of data and the critical calculations we must perform. While the philosophy of our designs has remained consistent, our engineering constraints are constantly evolving. In the past five years, we’ve seen the emergence of “big data,” or the ability to use commodity infrastructure to analyze very large data sets in a batch. We’re currently in the midst of a significant step forward in the tools, methods, and technologies available for working with real-time streams of data.

Read the report

Subscription-only reports

Updated versions of older reports will be available for free in the future, so check back often.

Multi-task learning

In this report, we focus on multi-task learning, a new approach to machine learning that allows algorithms to master tasks in parallel.

Preview the report

Probabilistic programming

Here, we show how to use probabilistic programming and Bayesian inference to easily build tools that make better predictions for more effective decision making.

Preview the report

Natural language generation

In this report, we look at how machine systems can turn highly structured data into human language narrative.

Preview the report

Read the Fast Forward Labs blog

Keep up with tomorrow

Sign up for our monthly newsletter and get the latest on advances in applied artificial intelligence, as well as company news and events.

Contact us about a research subscription
