Hi, my name is João F. Henriques. (It sounds a bit like “joo-au” in English.) I like to work in the convex hull of machine learning, deep learning and computer vision. Perhaps my best-known works are on visual tracking, but I have many favourite topics: friendly AI, robot mapping, meta-learning, continual learning, self-supervised learning and optimisation.

My talented DPhil students:

Marian Longa · Shu Ishida · Andreea Oncescu · Tim Franzmeyer · Dominik Kloepfer · Yash Bhalgat · Shivani Mall · Lorenza Prospero · Daniil Zverev

(Graduated: Xu Ji · Mandela Patrick)

Research

Publications, talks and source-code

Filter by topic

Extracting Reward Functions from Diffusion Models

F. Nuti, T. Franzmeyer, J. F. Henriques

arXiv, 2023

PDF arXiv

Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion

Y. Bhalgat, I. Laina, J. F. Henriques, A. Zisserman, A. Vedaldi

arXiv, 2023

PDF arXiv
LoCUS: Learning Multiscale 3D-consistent Features from Posed Images

D. Kloepfer, D. Campbell, J. F. Henriques

ICCV, 2023

PDF Appendix
RbA: Segmenting Unknown Regions Rejected by All

N. Nayal, M. Yavuz, J. F. Henriques, F. Güney

ICCV, 2023

PDF Appendix arXiv
CASSPR: Cross Attention Single Scan Place Recognition

Y. Xia, M. Gladkova, R. Wang, Q. Li, U. Stilla, J. F. Henriques, D. Cremers

ICCV, 2023

PDF Appendix arXiv
A Light Touch Approach to Teaching Transformers Multi-view Geometry

Y. Bhalgat, J. F. Henriques, A. Zisserman

CVPR, 2023

PDF arXiv
Generalised Lookahead Optimiser

C. Oncescu, J. Valmadre, J. F. Henriques

Tiny Papers at ICLR, 2023

PDF
Learn what matters: cross-domain imitation learning with task-relevant embeddings

T. Franzmeyer, P. Torr, J. F. Henriques

Advances in Neural Information Processing Systems, 2022

PDF arXiv
SNeS: Learning Probably Symmetric Neural Surfaces from Incomplete Data

E. Insafutdinov, D. Campbell, J. F. Henriques, A. Vedaldi

ECCV, 2022

We augment neural radiance fields to render views of partially-symmetric objects that are not seen in the data, such as when seeing a car from just one side. Since shadows and reflections break object symmetry, in the process we decompose scenes into geometry, light and material properties.

PDF arXiv
Towards real-world navigation with deep differentiable planners

S. Ishida, J. F. Henriques

CVPR, 2022

We train robot agents to explore and seek semantic goals, without hazardous trial-and-error, by using only safe demonstrations. We achieve this by extending and improving on Value Iteration Networks, enabling robots to cope even with mazes with a high branching factor.

PDF arXiv
Learning altruistic behaviours in reinforcement learning without external rewards

T. Franzmeyer, M. Malinowski, J. F. Henriques

ICLR, 2022

How can autonomous agents help others, including humans, without having exact knowledge of their goals? We explore the concept of increasing others' choice so that they can more easily pursue arbitrary goals, which in some cases even outperforms explicitly cooperative rewards.

PDF arXiv

Audio retrieval with natural language queries: A benchmark study

A. S. Koepke, A. Oncescu, J. Henriques, Z. Akata, S. Albanie

IEEE Transactions on Multimedia, 2022

Illusionary Attacks on Sequential Decision Makers and Countermeasures

T. Franzmeyer, J. F. Henriques, J. N. Foerster, P. H. Torr, A. Bibi, C. S. de Witt

arXiv, 2022

PDF arXiv
Keeping your eye on the ball: Trajectory attention in video transformers

M. Patrick, D. Campbell, Y. M. Asano, I. Misra, F. Metze, C. Feichtenhofer, A. Vedaldi, J. F. Henriques

NeurIPS, 2021 (oral presentation)

We improve video transformers (e.g. for action recognition) by encouraging attention pooling over motion paths. We also reduce the quadratic computational complexity of attention to linear, with a rigorous probabilistic approximation based on orthogonal prototypes.

PDF arXiv
Multi-modal self-supervision from generalized data transformations

M. Patrick, Y. M. Asano, P. Kuznetsova, R. Fong, J. F. Henriques, G. Zweig, A. Vedaldi

ICCV, 2021

Most contrastive self-supervised methods learn representations that are distinctive to individual examples, and invariant to several other factors. We propose a framework to systematically evaluate valid combinations of distinctive and invariant factors, yielding superior performance in many multi-modal learning tasks.

PDF Code arXiv
Space-time crop & Attend: Improving cross-modal video representation learning

M. Patrick, Y. M. Asano, P. Huang, I. Misra, F. Metze, J. F. Henriques, A. Vedaldi

ICCV, 2021

PDF Code arXiv
Quantised Transforming Auto-Encoders: Achieving equivariance to arbitrary transformations in deep networks

J. Jiao, J. F. Henriques

BMVC, 2021

PDF
Moving SLAM: Fully unsupervised deep learning in non-rigid scenes

D. Xu, A. Vedaldi, J. F. Henriques

IROS, 2021

A self-supervised network that learns to decompose a video into camera motion, depths, object segmentation, and object motion in 6D (translation and rotation). We do this by assuming a locally-rigid world model in every patch of the video.

PDF arXiv
Support-set bottlenecks for video-text representation learning

M. Patrick, P. Huang, Y. Asano, F. Metze, A. G. Hauptmann, J. F. Henriques, A. Vedaldi

ICLR, 2021

We investigate noise-contrastive learning of video-text neural networks. We find that learning to reconstruct video captions with video retrieval as a representational bottleneck yields better semantic representations.

PDF arXiv
Audio retrieval with natural language queries

A. Oncescu, A. S. Koepke, J. F. Henriques, Z. Akata, S. Albanie

Interspeech, 2021 (nominated for best student paper award)

A content-based audio search engine: similar to Google Images, but for audio.

PDF arXiv
QuerYD: A video dataset with high-quality text and audio narrations

A. Oncescu, J. F. Henriques, Y. Liu, A. Zisserman, S. Albanie

ICASSP, 2021

A dataset with more than 70 hours of detailed spoken narrations for 200 hours of varied YouTube videos, from more than 1400 volunteer narrators. Text transcriptions are also included. Based on YouDescribe, the goal is to narrate as many videos as possible for the visually impaired, and to facilitate automatic narrations.

PDF Project page with dataset arXiv
Automatic Recall Machines: Internal replay, continual learning and the brain

X. Ji, J. Henriques, T. Tuytelaars, A. Vedaldi

NeurIPS Workshops, 2020

Avoiding catastrophic forgetting with context-sensitive generative recall, inspired by biological memory.

PDF arXiv
360° camera alignment via segmentation

B. Davidson, M. S. Alvi, J. F. Henriques

ECCV, 2020

PDF
Gradient shape model

P. Martins, J. F. Henriques, J. Batista

IJCV, 2020

PDF
Small steps and giant leaps: Minimal Newton solvers for deep learning

J. F. Henriques, S. Ehrhardt, S. Albanie, A. Vedaldi

ICCV, 2019

We propose CurveBall, a fast second-order optimizer for deep networks that is simple to implement and does not require hyper-parameter tuning.

PDF Optimisers visualisation (GIF) PyTorch code TensorFlow code Matlab code Slides Appendix arXiv
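As a rough sketch of the idea (not the paper's exact algorithm): instead of inverting the Hessian, a persistent search direction is updated with a cheap Hessian-vector product and then applied to the parameters. A minimal NumPy version on a toy quadratic, with the hyper-parameter names and values (rho, beta) assumed for illustration:

```python
import numpy as np

# Toy quadratic: f(w) = 0.5 * w' A w - b' w, so grad = A w - b and
# the Hessian-vector product H z is simply A z.
A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])

w = np.zeros(2)
z = np.zeros(2)          # persistent search direction (momentum-like)
rho, beta = 0.9, 0.05    # decay and step size (illustrative values)

for _ in range(1000):
    grad = A @ w - b
    Hz = A @ z                          # curvature along z, no full Hessian
    z = rho * z - beta * (Hz + grad)    # curvature-corrected momentum step
    w = w + z

# w approaches the minimiser A^{-1} b = [1.0, 0.1]
```

The point of the sketch is that only Hessian-vector products are needed, which cost about as much as a gradient via automatic differentiation.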
Invariant information clustering for unsupervised image classification and segmentation

X. Ji, J. F. Henriques, A. Vedaldi

ICCV, 2019

A simple-to-implement mutual information objective that trains deep networks to perform clustering from scratch, with no labels and only one hyper-parameter. Experiments include self-supervised clustering and image segmentation.

PDF PyTorch code arXiv
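The objective fits in a few lines. A minimal NumPy version (in the real method the assignments are soft-max outputs of a deep network for two augmented views; the function and argument names here are illustrative):

```python
import numpy as np

def iic_loss(p1, p2, eps=1e-9):
    """Negative mutual information between cluster assignments of two views.

    p1, p2: (n, k) arrays of soft cluster assignments (rows sum to 1)
    for two augmentations of the same n images.
    """
    P = p1.T @ p2 / p1.shape[0]   # joint distribution over cluster pairs
    P = (P + P.T) / 2             # symmetrise: the two views are interchangeable
    Pi = P.sum(axis=1, keepdims=True)   # marginal over the first view
    Pj = P.sum(axis=0, keepdims=True)   # marginal over the second view
    return -np.sum(P * (np.log(P + eps) - np.log(Pi + eps) - np.log(Pj + eps)))
```

With perfectly consistent, balanced one-hot assignments the loss reaches its minimum of -log(k), so maximising mutual information simultaneously encourages consistency across views and balanced cluster usage.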
Meta-learning with differentiable closed-form solvers

L. Bertinetto, J. F. Henriques, P. H. S. Torr, A. Vedaldi

ICLR, 2019

We propose a meta-learning deep network that learns to adapt quickly to novel examples, by inserting a ridge regressor (or another classical learner) inside of the network. This is made efficient by using the Woodbury trick, leveraging the fact that sample sizes in one-shot (few-shot) learning problems are small. The result is an extremely fast and accurate one-shot learner, with a simple and hyper-parameter-less implementation.

PDF Project page PyTorch code Poster arXiv
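The Woodbury trick mentioned above is standard linear algebra: for n samples with d features, ridge regression can be solved by inverting an n×n matrix instead of a d×d one, which is much cheaper when n ≪ d, as in few-shot episodes. A minimal NumPy illustration (not the paper's code):

```python
import numpy as np

def ridge_direct(X, Y, lam):
    # Classic ridge solution: solve a (d x d) system.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def ridge_woodbury(X, Y, lam):
    # Equivalent "sample-space" form: solve an (n x n) system instead,
    # far cheaper in few-shot settings where n << d.
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), Y)
```

Both return identical weights, since (XᵀX + λI)⁻¹Xᵀ = Xᵀ(XXᵀ + λI)⁻¹.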
MapNet: An allocentric spatial memory for mapping environments

J. F. Henriques, A. Vedaldi

CVPR, 2018 (oral presentation)

SLAM (Simultaneous Localization And Mapping) is crucial for robotics, but traditional systems cannot improve by learning from data. We propose MapNet, an end-to-end learnable deep network that solves the full SLAM problem, by leveraging efficient operations on a spatial memory.

PDF Blog post Results video (real data) Results video (Doom game) PyTorch code Talk Slides
Long-term tracking in the wild: A benchmark

J. Valmadre, L. Bertinetto, J. F. Henriques, R. Tao, A. Vedaldi, A. W. Smeulders, P. H. Torr, E. Gavves

ECCV, 2018

A tracking benchmark with over 14 hours of video, focusing on diverse footage captured "in the wild" and long-term performance.

PDF Project page with dataset
Warped Convolutions: Efficient invariance to spatial transformations

J. F. Henriques, A. Vedaldi

ICML, 2017 (oral presentation)

Convolutions match patterns across translations. We generalize convolutions (and thus CNNs) to work across scaling, rotation and more, including 3D rotations. We show how this generalization can be done with negligible overhead, by performing a single fixed warp before a standard convolution.

PDF Slides (visual explanation) arXiv (proofs variant using simple calculus)
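The key observation is that a fixed warp can turn other transformations into translations, which convolutions already handle. For instance, under a log-polar warp about the image centre, rotation becomes a circular shift along one axis (and scaling a shift along the other). A minimal NumPy/SciPy sketch of such a warp (illustrative, not the paper's implementation):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def log_polar_warp(img, n_r=48, n_t=64):
    """Resample an image onto a log-polar grid about its centre.

    Rows index log-radius, columns index angle, so a rotation of `img`
    becomes a circular shift of the output along the column axis.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = np.exp(np.linspace(0.0, np.log(min(cy, cx)), n_r))
    theta = np.linspace(0.0, 2.0 * np.pi, n_t, endpoint=False)
    ys = cy + radius[:, None] * np.sin(theta[None, :])
    xs = cx + radius[:, None] * np.cos(theta[None, :])
    return map_coordinates(img, [ys, xs], order=1)  # bilinear resampling
```

A standard convolution applied after this fixed warp then matches patterns across rotations and scalings of the input, at the cost of a single resampling step.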
End-to-end representation learning for correlation filter based tracking

J. Valmadre, L. Bertinetto, J. F. Henriques, A. Vedaldi, P. H. S. Torr

CVPR, 2017

This paper presents the CFNet tracker, a meta-learning network that trains and evaluates a correlation filter learner as part of its forward pass. Winner of the Real-time Visual Object Tracking (VOT) Challenge in 2017.

PDF Project page Code arXiv
ResearchDoom and CocoDoom: Learning computer vision with games

A. Mahendran, H. Bilen, J. F. Henriques, A. Vedaldi

arXiv, 2017

A large dataset in the COCO format with object class/instance bounding boxes and segmentation masks, as well as depth and egomotion, extracted from speedruns of the videogame Doom.

PDF Dataset Code arXiv
Circulant structures in computer vision

J. F. Henriques

PhD thesis, 2016

My thesis contains a tutorial introduction to circulant matrices and Fourier methods for machine learning.

PDF
Bayesian constrained local models revisited

P. Martins, J. F. Henriques, R. Caseiro, J. Batista

TPAMI, 2016

PDF Video
Learning feed-forward one-shot learners

L. Bertinetto, J. F. Henriques, J. Valmadre, P. H. S. Torr, A. Vedaldi

NeurIPS, 2016

Early work on meta-learning for one-shot learning, where a deep network predicts the parameters of another network, given a few examples of a classification task.

PDF Slides arXiv
Fully-convolutional Siamese networks for object tracking

L. Bertinetto, J. Valmadre, J. F. Henriques, A. Vedaldi, P. H. S. Torr

ECCV Workshops, 2016

The SiamFC tracker, one of the fastest visual trackers based on deep networks. This basic architecture has been used in numerous follow-up works, and commercially-deployed systems.

PDF Project page Code Short talk arXiv
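The core operation is simple: embed an exemplar patch and a larger search region with the same network, then cross-correlate the two feature maps to produce a score map whose peak locates the target. A toy single-channel NumPy/SciPy sketch, with the learned embedding replaced by the identity for illustration:

```python
import numpy as np
from scipy.signal import correlate2d

def score_map(exemplar, search):
    # In SiamFC both inputs are first embedded by the same conv-net and the
    # correlation is summed over feature channels; here we use raw pixels.
    return correlate2d(search, exemplar, mode='valid')

rng = np.random.default_rng(0)
exemplar = rng.random((8, 8))
search = np.zeros((32, 32))
search[12:20, 5:13] = exemplar        # hide the target at offset (12, 5)

scores = score_map(exemplar, search)
peak = np.unravel_index(np.argmax(scores), scores.shape)
# peak == (12, 5): the correlation peak recovers the target location
```

Because the correlation is fully convolutional, one forward pass scores every candidate location in the search region at once, which is what makes the tracker fast.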
High-speed tracking with kernelized correlation filters

J. F. Henriques, R. Caseiro, P. Martins, J. Batista

TPAMI, 2015

The KCF, an extremely fast visual tracker (hundreds of frames per second), especially suited for resource-constrained devices. It relies on the Fast Fourier Transform, with online learning based on the theory of circulant matrices.

PDF Matlab code C++ code (official OpenCV class, single-scale) C++ code (multi-scale) C++ code (streamlined) arXiv
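The flavour of the approach can be seen in a linear, single-channel sketch in the style of frequency-domain correlation filters (the actual KCF adds a kernel and updates the filter online; the names below are illustrative):

```python
import numpy as np

def train_filter(x, y, lam=1e-4):
    # Fit a linear correlation filter in the Fourier domain: every circular
    # shift of x acts as a virtual training sample, so the regularised
    # least-squares solution becomes element-wise, O(n log n) overall.
    X, Y = np.fft.fft2(x), np.fft.fft2(y)
    return np.conj(X) * Y / (np.conj(X) * X + lam)

def detect(h_hat, z):
    # Correlation response over all circular shifts of the search image z.
    return np.real(np.fft.ifft2(h_hat * np.fft.fft2(z)))
```

Training amounts to a few FFTs and element-wise divisions, which is why these trackers run at hundreds of frames per second.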
Beyond the shortest path: Unsupervised domain adaptation by sampling subspaces along the spline flow

R. Caseiro, P. Martins, J. F. Henriques, J. Batista

CVPR, 2015

PDF
Fast training of pose detectors in the Fourier domain

J. F. Henriques, P. Martins, R. Caseiro, J. Batista

NeurIPS, 2014

PDF Appendix A Appendix B (with code)
Likelihood-enhanced Bayesian constrained local models

P. Martins, R. Caseiro, J. F. Henriques, J. Batista

ICIP, 2014 (top 10% of accepted papers)

PDF Video arXiv
Beyond hard negative mining: Efficient detector learning via block-circulant decomposition

J. F. Henriques, J. Carreira, R. Caseiro, J. Batista

ICCV, 2013 (oral presentation)

PDF Code Slides Talk Appendix
Rolling Riemannian manifolds to solve the multi-class classification problem

R. Caseiro, P. Martins, J. F. Henriques, J. Carreira, J. Batista

CVPR, 2013 (oral presentation)

PDF
Exploiting the circulant structure of tracking-by-detection with kernels

J. F. Henriques, R. Caseiro, P. Martins, J. Batista

ECCV, 2012

The first tracker based on circulant matrices. A more recent version is the KCF.

PDF Video Matlab code Python code Java code Appendix B Appendix C Poster
Semi-intrinsic mean shift on Riemannian manifolds

R. Caseiro, J. F. Henriques, P. Martins, J. Batista

ECCV, 2012

PDF
Discriminative Bayesian active shape models

P. Martins, R. Caseiro, J. F. Henriques, J. Batista

ECCV, 2012

PDF Video
Let the shape speak: Face alignment using conjugate priors

P. Martins, R. Caseiro, J. F. Henriques, J. Batista

BMVC, 2012 (oral presentation)

PDF Video
A nonparametric Riemannian framework on tensor field with application to foreground segmentation

R. Caseiro, P. Martins, J. F. Henriques, J. Batista

Pattern Recognition, 2012

PDF
A nonparametric Riemannian framework on tensor field with application to foreground segmentation

R. Caseiro, J. F. Henriques, P. Martins, J. Batista

ICCV, 2011

PDF
Tracking in streamed video by updating globally optimal matchings

J. F. Henriques, R. Caseiro, J. Batista

ICIP, 2010

PDF
Using directional statistics to learn cast shadows from a multi-spectral light sources physical model

R. Caseiro, J. F. Henriques, J. Batista

ICIP, 2010

PDF

More

Research-related

Workshops on Preregistration

An alternative publication model for machine learning research

Preregistration separates the generation and confirmation of hypotheses:

Come up with an exciting research question
Write a paper proposal without confirmatory experiments
After the paper is accepted, run the experiments and report your results

This model has several advantages: 1) a healthy mix of positive and negative results; 2) reasonable ideas that don’t work still get published, avoiding wasteful replications; 3) papers are evaluated on scientific interest, not on whether they achieve the best results; 4) research is easier to plan; and 5) results are statistically stronger. Check the pages below for more information, including talks and preregistered machine learning papers.

OverBoard

A pure Python dashboard for monitoring deep learning experiments

OverBoard is a lightweight yet powerful dashboard to monitor your experiments. It includes:

A table of hyper-parameters with Python-syntax filtering
Multiple views of the same data (e.g. custom X/Y axes)
Hyper-parameter visualisation (e.g. bubble plots)
Percentile intervals for multiple runs (e.g. shaded plots)
Custom visualisations (tensors and any custom plot with familiar Matplotlib syntax)
Fast client-side rendering (the training code is kept lightweight)
You can install it with: pip install overboard
Its only dependencies are PyQtGraph (conda install pyqt pyqtgraph -c anaconda) and Python 3.

Fun

Not mutually-exclusive with research

Large Language Models are Few-shot Publication Scoopers

S. Albanie, L. Momeni, J. F. Henriques

SIGBOVIK, 2023

Our research team jumps on the LLM bandwagon, scoops in hand. This article was 100% produced by free-range humans.

PDF arXiv
A 23 MW data centre is all you need

S. Albanie, D. Campbell, J. F. Henriques

SIGBOVIK, 2022

A deep dive into the murky waters of AI future prediction and the legalities of scaling laws.

PDF arXiv
On the origin of species of self-supervised learning

S. Albanie, E. Lu, J. F. Henriques

SIGBOVIK, 2021

Inspired by the squishy sciences, we embark on a perilous journey to explain today's Cambrian explosion in self-supervised methods.

PDF arXiv
State-of-art-reviewing: A radical proposal to improve scientific publication

S. Albanie, J. Thewmore, R. McCraith, J. F. Henriques

SIGBOVIK, 2020

A way to fast-forward through arXiv submissions, and the artistic value of taping edible objects to a wall (gallery included).

PDF arXiv
Deep industrial espionage

S. Albanie, J. Thewlis, S. Ehrhardt, J. F. Henriques

Narrowly missed SIGBOVIK, 2019

The most end-to-end network ever proposed, and a sunnier alternative to cloud computing. Narrowly missing the deadline for SIGBOVIK 2019, it received the "Most timely paper" award at SIGBOVIK 2020.

PDF arXiv
Substitute teacher networks: Learning with almost no supervision

S. Albanie, J. Thewlis, J. F. Henriques

SIGBOVIK, 2018

Includes a discussion of ruthless edu-tech business tactics, and baking cherry cakes with a fellow whose name rhymes with quelqu'un.

PDF arXiv Code SIGBOVIK Reviews
Stopping GAN violence: Generative unadversarial networks

S. Albanie, S. Ehrhardt, J. F. Henriques

SIGBOVIK, 2017

An attempt to end the madness of pitting network-against-network (GAN training). This paper achieved moderate success on social media, which meant that all subsequent papers were doomed to obscurity (but that didn't stop us).

Surprisingly, there is an entirely serious paper that experiments with generative unadversarial training and credits our joke paper as the inspiration! (With full knowledge that it is not to be taken seriously of course.) Mission accomplished.

PDF arXiv SIGBOVIK Award Code