Datasets by Year

Below is the list of datasets grouped by year. Within each group, the datasets are **sorted** by **popularity** (i.e., how often they are used in prediction papers).

Each dataset in the list has an associated link to the publication page and/or arXiv preprint, if available. By clicking on the dataset you can get the following information:

  • Summary of the dataset's characteristics, e.g. quantity, number of objects or classes, etc.
  • Applications that use the dataset
  • Data type and annotations available in the dataset
  • Task of the dataset, e.g. driving, activity, etc.
  • Papers that used the dataset in chronological order
  • BibTeX of the dataset
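
The grouping and ordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `records` list and the `group_by_year` helper are hypothetical and not part of this repository, and each usage count is simply the number of entries in the corresponding "Used in papers" list on this page.

```python
from collections import defaultdict

# Hypothetical records: (dataset name, year, number of prediction papers using it).
# The counts mirror the "Used in papers" lists shown on this page.
records = [
    ("PIE", 2019, 2),
    ("TITAN", 2020, 1),
    ("CARLA", 2019, 3),
    ("nuScenes", 2019, 5),
    ("Lyft", 2020, 1),
]


def group_by_year(records):
    """Group datasets by year (newest first), most-used dataset first within a year."""
    groups = defaultdict(list)
    for name, year, n_papers in records:
        groups[year].append((name, n_papers))
    # sorted() is stable, so datasets with equal counts keep their input order.
    return {
        year: [name for name, _ in sorted(items, key=lambda item: -item[1])]
        for year, items in sorted(groups.items(), reverse=True)
    }


listing = group_by_year(records)
for year, names in listing.items():
    print(year, names)
```

Because the sort is stable, datasets with the same usage count (here TITAN and Lyft) stay in the order in which they were supplied.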

2020

    TITAN link paper arxiv
    • Summary: A dataset of 700 front-view driving video clips for pedestrian action and trajectory prediction, annotated at 10 Hz
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, Bounding Box, Attribute, Object Class, Tracking ID
    • Task: Driving
      Used in papers
        Malla et al., "TITAN: Future Forecast Using Action Priors", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Malla_2020_CVPR,
              author = "Malla, Srikanth and Dariush, Behzad and Choi, Chiho",
              title = "TITAN: Future Forecast Using Action Priors",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Malla_2020_CVPR,
          author = "Malla, Srikanth and Dariush, Behzad and Choi, Chiho",
          title = "TITAN: Future Forecast Using Action Priors",
          booktitle = "CVPR",
          year = "2020"
      }
      
    Multiverse link paper arxiv
    • Summary: A dataset of 3K simulated videos of pedestrian trajectory samples from 4 different camera views. Each sample comes with multiple human-annotated possible trajectories.
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, Bounding Box, Semantic Segment, Tracking ID
    • Task: Surveillance (simulation)
      Used in papers
        Liang et al., "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2020_CVPR_2,
              author = "Liang, Junwei and Jiang, Lu and Murphy, Kevin and Yu, Ting and Hauptmann, Alexander",
              title = "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Liang_2020_CVPR_2,
          author = "Liang, Junwei and Jiang, Lu and Murphy, Kevin and Yu, Ting and Hauptmann, Alexander",
          title = "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction",
          booktitle = "CVPR",
          year = "2020"
      }
      
    Oops! link paper arxiv
    • Summary: A dataset of 20K+ video clips of failed actions, including physical and social errors, errors in planning and execution, etc.
    • Applications: Action prediction
    • Data type and annotations: RGB, Activity Label, Temporal Segment
    • Task: Activity
      Used in papers
        Epstein et al., "Oops! Predicting Unintentional Action in Video", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Epstein_2020_CVPR,
              author = "Epstein, Dave and Chen, Boyuan and Vondrick, Carl",
              title = "Oops! Predicting Unintentional Action in Video",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Epstein_2020_CVPR,
          author = "Epstein, Dave and Chen, Boyuan and Vondrick, Carl",
          title = "Oops! Predicting Unintentional Action in Video",
          booktitle = "CVPR",
          year = "2020"
      }
      
    FaceScape link paper arxiv
    • Summary: A dataset of 900+ faces and corresponding multi-view 3D meshes
    • Applications: Other prediction
    • Data type and annotations: RGB, 3D Model, Landmarks
    • Task: Face
      Used in papers
        Yang et al., "FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Yang_2020_CVPR,
              author = "Yang, Haotian and Zhu, Hao and Wang, Yanru and Huang, Mingkai and Shen, Qiu and Yang, Ruigang and Cao, Xun",
              title = "FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Yang_2020_CVPR,
          author = "Yang, Haotian and Zhu, Hao and Wang, Yanru and Huang, Mingkai and Shen, Qiu and Yang, Ruigang and Cao, Xun",
          title = "FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction",
          booktitle = "CVPR",
          year = "2020"
      }
      
    Citywalks link paper arxiv
    • Summary: A dataset of 500+ front-view sequences of pedestrian trajectories annotated at 30fps
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, Bounding Box, Attribute, Tracking ID
    • Task: Walking
      Used in papers
        Styles et al., "Multiple Object Forecasting: Predicting Future Object Locations in Diverse Environments", WACV, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Styles_2020_WACV,
              author = "Styles, Oliver and Sanchez, Victor and Guha, Tanaya",
              title = "Multiple Object Forecasting: Predicting Future Object Locations in Diverse Environments",
              booktitle = "WACV",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Styles_2020_WACV,
          author = "Styles, Oliver and Sanchez, Victor and Guha, Tanaya",
          title = "Multiple Object Forecasting: Predicting Future Object Locations in Diverse Environments",
          booktitle = "WACV",
          year = "2020"
      }
      
    Freiburg Imra Testing (FIT) link paper arxiv
    • Summary: A small-scale driving dataset with automatically generated bounding box track annotations
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, Bounding Box, Object Class, Semantic Segment
    • Task: Driving
      Used in papers
        Makansi et al., "Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Makansi_2020_CVPR,
              author = "Makansi, Osama and Cicek, Ozgun and Buchicchio, Kevin and Brox, Thomas",
              title = "Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Makansi_2020_CVPR,
          author = "Makansi, Osama and Cicek, Ozgun and Buchicchio, Kevin and Brox, Thomas",
          title = "Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior",
          booktitle = "CVPR",
          year = "2020"
      }
      
    Lyft link
    • Summary: A driving dataset with 1M+ 3D annotations for perception and 1K+ driving sequences for prediction, annotated at 2 Hz
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, LIDAR, 3D Bounding Box, Object Class, Attribute, Map, Tracking ID
    • Task: Driving
      Used in papers
        Zhang et al., "STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zhang_2020_CVPR,
              author = "Zhang, Zhishuai and Gao, Jiyang and Mao, Junhua and Liu, Yukai and Anguelov, Dragomir and Li, Congcong",
              title = "STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @Misc{Lyft_2020,
          author = "Lyft",
          title = "Lyft Level 5 AV Dataset",
          howpublished = "https://self-driving.lyft.com/level5/data/",
          year = "2020"
      }
      

2019

    nuScenes link paper arxiv
    • Summary: A dataset with 1K driving sequences and 1M+ 3D bounding boxes, annotated at 2 Hz
    • Applications:
    • Data type and annotations: RGB, LIDAR, 3D Bounding Box, Object Class, Attribute, Map, Tracking ID
    • Task: Driving
      Used in papers
        Liang et al., "PnPNet: End-to-End Perception and Prediction With Tracking in the Loop", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2020_CVPR,
              author = "Liang, Ming and Yang, Bin and Zeng, Wenyuan and Chen, Yun and Hu, Rui and Casas, Sergio and Urtasun, Raquel",
              title = "PnPNet: End-to-End Perception and Prediction With Tracking in the Loop",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Makansi et al., "Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Makansi_2020_CVPR,
              author = "Makansi, Osama and Cicek, Ozgun and Buchicchio, Kevin and Brox, Thomas",
              title = "Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Phan-Minh et al., "CoverNet: Multimodal Behavior Prediction Using Trajectory Sets", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Phan-Minh_2020_CVPR,
              author = "Phan-Minh, Tung and Grigore, Elena Corina and Boulton, Freddy A. and Beijbom, Oscar and Wolff, Eric M.",
              title = "CoverNet: Multimodal Behavior Prediction Using Trajectory Sets",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Wu et al., "MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Wu_2020_CVPR_2,
              author = "Wu, Pengxiang and Chen, Siheng and Metaxas, Dimitris N.",
              title = "MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Rhinehart et al., "Precog: Prediction Conditioned On Goals In Visual Multi-Agent Settings", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Rhinehart_2019_ICCV,
              author = "Rhinehart, Nicholas and McAllister, Rowan and Kitani, Kris and Levine, Sergey",
              title = "Precog: Prediction Conditioned On Goals In Visual Multi-Agent Settings",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Caesar_2020_CVPR,
          author = "Caesar, Holger and Bankiti, Varun and Lang, Alex H. and Vora, Sourabh and Liong, Venice Erin and Xu, Qiang and Krishnan, Anush and Pan, Yu and Baldan, Giancarlo and Beijbom, Oscar",
          title = "nuScenes: A Multimodal Dataset for Autonomous Driving",
          booktitle = "CVPR",
          year = "2020"
      }
      
    CARLA link paper arxiv
    • Summary: A dataset of 900 simulated driving segments for multi-agent trajectory forecasting and planning
    • Applications: Action prediction
    • Data type and annotations: RGB
    • Task: Driving (simulation)
      Used in papers
        Rhinehart et al., "Precog: Prediction Conditioned On Goals In Visual Multi-Agent Settings", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Rhinehart_2019_ICCV,
              author = "Rhinehart, Nicholas and McAllister, Rowan and Kitani, Kris and Levine, Sergey",
              title = "Precog: Prediction Conditioned On Goals In Visual Multi-Agent Settings",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Ding et al., "Online Vehicle Trajectory Prediction Using Policy Anticipation Network And Optimization-Based Context Reasoning", ICRA, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ding_2019_ICRA_2,
              author = "Ding, W. and Shen, S.",
              booktitle = "ICRA",
              title = "Online Vehicle Trajectory Prediction Using Policy Anticipation Network And Optimization-Based Context Reasoning",
              year = "2019"
          }
          
        Oh et al., "HCNAF: Hyper-Conditioned Neural Autoregressive Flow and its Application for Probabilistic Occupancy Map Forecasting", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Oh_2020_CVPR,
              author = "Oh, Geunseob and Valois, Jean-Sebastien",
              title = "HCNAF: Hyper-Conditioned Neural Autoregressive Flow and its Application for Probabilistic Occupancy Map Forecasting",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Rhinehart_2019_ICCV,
          author = "Rhinehart, Nicholas and McAllister, Rowan and Kitani, Kris and Levine, Sergey",
          title = "Precog: Prediction Conditioned On Goals In Visual Multi-Agent Settings",
          booktitle = "ICCV",
          year = "2019"
      }
      
    Waymo Open Dataset (WOD) link paper arxiv
    • Summary: A dataset with approx. 2K driving segments and 20M+ 2D/3D bounding boxes, annotated at 10 Hz
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, LIDAR, Bounding Box, 3D Bounding Box, Object Class, Tracking ID
    • Task: Driving
      Used in papers
        Makansi et al., "Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Makansi_2020_CVPR,
              author = "Makansi, Osama and Cicek, Ozgun and Buchicchio, Kevin and Brox, Thomas",
              title = "Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Mohamed et al., "Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Mohamed_2020_CVPR,
              author = "Mohamed, Abduallah and Qian, Kun and Elhoseiny, Mohamed and Claudel, Christian",
              title = "Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Zhang et al., "STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zhang_2020_CVPR,
              author = "Zhang, Zhishuai and Gao, Jiyang and Mao, Junhua and Liu, Yukai and Anguelov, Dragomir and Li, Congcong",
              title = "STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Sun_2020_CVPR_3,
          author = "Sun, Pei and Kretzschmar, Henrik and Dotiwalla, Xerxes and Chouard, Aurelien and Patnaik, Vijaysai and Tsui, Paul and Guo, James and Zhou, Yin and Chai, Yuning and Caine, Benjamin and Vasudevan, Vijay and Han, Wei and Ngiam, Jiquan and Zhao, Hang and Timofeev, Aleksei and Ettinger, Scott and Krivokon, Maxim and Gao, Amy and Joshi, Aditya and Zhang, Yu and Shlens, Jonathon and Chen, Zhifeng and Anguelov, Dragomir",
          title = "Scalability in Perception for Autonomous Driving: Waymo Open Dataset",
          booktitle = "CVPR",
          year = "2020"
      }
      
    Pedestrian Intention Estimation (PIE) link paper
    • Summary: A dataset of driving sequences with more than 6 hours of footage with 1.8K+ pedestrian tracks and 2.3M+ relevant traffic object bounding boxes annotated at 30fps
    • Applications: Action prediction, Trajectory prediction
    • Data type and annotations: RGB, Bounding Box, Object Class, Attribute, Temporal Segment, Vehicle Sensors, Tracking ID
    • Task: Driving
      Used in papers
        Rasouli et al., "Pedestrian Action Anticipation Using Contextual Feature Fusion In Stacked RNNs", BMVC, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Rasouli_2019_BMVC,
              author = "Rasouli, Amir and Kotseruba, Iuliia and Tsotsos, John K",
              title = "Pedestrian Action Anticipation Using Contextual Feature Fusion In Stacked RNNs",
              year = "2019",
              booktitle = "BMVC"
          }
          
        Rasouli et al., "PIE: A Large-Scale Dataset And Models For Pedestrian Intention Estimation And Trajectory Prediction", ICCV, 2019. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Rasouli_2019_ICCV,
              author = "Rasouli, Amir and Kotseruba, Iuliia and Kunic, Toni and Tsotsos, John K.",
              title = "PIE: A Large-Scale Dataset And Models For Pedestrian Intention Estimation And Trajectory Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Rasouli_2019_ICCV,
          author = "Rasouli, Amir and Kotseruba, Iuliia and Kunic, Toni and Tsotsos, John K.",
          title = "PIE: A Large-Scale Dataset And Models For Pedestrian Intention Estimation And Trajectory Prediction",
          booktitle = "ICCV",
          year = "2019"
      }
      
    Argoverse link paper arxiv
    • Summary: A dataset with 100+ driving segments and 10K+ 3D bounding boxes for tracking and 300K+ segments for forecasting, annotated at 10 Hz
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, LIDAR, 3D Bounding Box, Map
    • Task: Driving
      Used in papers
        Fang et al., "TPNet: Trajectory Proposal Network for Motion Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Fang_2020_CVPR,
              author = "Fang, Liangji and Jiang, Qinhong and Shi, Jianping and Zhou, Bolei",
              title = "TPNet: Trajectory Proposal Network for Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Chang et al., "Argoverse: 3D Tracking And Forecasting With Rich Maps", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Chang_2019_CVPR,
              author = "Chang, Ming-Fang and Lambert, John and Sangkloy, Patsorn and Singh, Jagjeet and Bak, Slawomir and Hartnett, Andrew and Wang, De and Carr, Peter and Lucey, Simon and Ramanan, Deva and Hays, James",
              title = "Argoverse: 3D Tracking And Forecasting With Rich Maps",
              booktitle = "CVPR",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Chang_2019_CVPR,
          author = "Chang, Ming-Fang and Lambert, John and Sangkloy, Patsorn and Singh, Jagjeet and Bak, Slawomir and Hartnett, Andrew and Wang, De and Carr, Peter and Lucey, Simon and Ramanan, Deva and Hays, James",
          title = "Argoverse: 3D Tracking And Forecasting With Rich Maps",
          booktitle = "CVPR",
          year = "2019"
      }
      
    TRAF link paper arxiv
    • Summary: A dataset of 50 driving sequences with a mix of front-view and top-down-view clips, annotated at 30fps
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, Bounding Box, Object Class, Time-of-Day, Tracking ID
    • Task: Driving
      Used in papers
        Chandra et al., "Traphic: Trajectory Prediction In Dense And Heterogeneous Traffic Using Weighted Interactions", CVPR, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Chandra_2019_CVPR,
              author = "Chandra, Rohan and Bhattacharya, Uttaran and Bera, Aniket and Manocha, Dinesh",
              title = "Traphic: Trajectory Prediction In Dense And Heterogeneous Traffic Using Weighted Interactions",
              booktitle = "CVPR",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Chandra_2019_CVPR,
          author = "Chandra, Rohan and Bhattacharya, Uttaran and Bera, Aniket and Manocha, Dinesh",
          title = "Traphic: Trajectory Prediction In Dense And Heterogeneous Traffic Using Weighted Interactions",
          booktitle = "CVPR",
          year = "2019"
      }
      
    MGIF link paper arxiv
    • Summary: A dataset of cartoon animal animations for future video prediction
    • Applications: Video prediction
    • Data type and annotations: RGB
    • Task: Activity
      Used in papers
        Kim et al., "Unsupervised Keypoint Learning For Guiding Class-Conditional Video Prediction", NeurIPS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Kim_2019_NeurIPS,
              author = "Kim, Yunji and Nam, Seonghyeon and Cho, In and Kim, Seon Joo",
              title = "Unsupervised Keypoint Learning For Guiding Class-Conditional Video Prediction",
              booktitle = "NeurIPS",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Siarohin_2019_CVPR,
          author = "Siarohin, Aliaksandr and Lathuilière, Stéphane and Tulyakov, Sergey and Ricci, Elisa and Sebe, Nicu",
          title = "Animating Arbitrary Objects Via Deep Motion Transfer",
          booktitle = "CVPR",
          year = "2019"
      }
      
    Luggage link paper arxiv
    • Summary: A dataset of 13K indoor video clips, each showing trajectories of people that end in close proximity to (near collision with) a camera mounted on a mobile suitcase-shaped platform
    • Applications: Action prediction
    • Data type and annotations: Stereo RGB, Bounding Box
    • Task: Robot
      Used in papers
        Manglik et al., "Forecasting Time-To-Collision From Monocular Video: Feasibility, Dataset, And Challenges", IROS, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Manglik_2019_IROS,
              author = "Manglik, Aashi and Weng, Xinshuo and Ohn-Bar, Eshed and Kitani, Kris M",
              booktitle = "IROS",
              title = "Forecasting Time-To-Collision From Monocular Video: Feasibility, Dataset, And Challenges",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Manglik_2019_IROS,
          author = "Manglik, Aashi and Weng, Xinshuo and Ohn-Bar, Eshed and Kitani, Kris M",
          booktitle = "IROS",
          title = "Forecasting Time-To-Collision From Monocular Video: Feasibility, Dataset, And Challenges",
          year = "2019"
      }
      
    INTERACTION link arxiv
    • Summary: A naturalistic dataset of motions of various traffic road users in a variety of interactive driving scenarios for behavior modeling and prediction
    • Applications: Trajectory prediction
    • Data type and annotations: Map, Trajectory
    • Task: Driving
      Used in papers
        Li et al., "Conditional Generative Neural System For Probabilistic Trajectory Prediction", IROS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2019_IROS,
              author = "Li, Jiachen and Ma, Hengbo and Tomizuka, Masayoshi",
              booktitle = "IROS",
              title = "Conditional Generative Neural System For Probabilistic Trajectory Prediction",
              year = "2019"
          }
          
      Bibtex
      @Article{Zhan_2019_arxiv,
          author = "Zhan, Wei and Sun, Liting and Wang, Di and Shi, Haojie and Clausse, Aubrey and Naumann, Maximilian and Kummerle, Julius and Konigshof, Hendrik and Stiller, Christoph and de La Fortelle, Arnaud and others",
          title = "Interaction Dataset: An International, Adversarial And Cooperative Motion Dataset In Interactive Driving Scenarios With Semantic Maps",
          journal = "arXiv:1910.03088",
          year = "2019"
      }
      
    InstaVariety link paper arxiv
    • Summary: A dataset with 28 hours of video footage and corresponding auto-generated 2D poses
    • Applications: Motion prediction
    • Data type and annotations: RGB, Bounding Box, Pose
    • Task: Activity
      Used in papers
        Zhang et al., "Predicting 3D Human Dynamics From Video", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhang_2019_ICCV,
              author = "Zhang, Jason Y. and Felsen, Panna and Kanazawa, Angjoo and Malik, Jitendra",
              title = "Predicting 3D Human Dynamics From Video",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Kanazawa_2019_CVPR,
          author = "Kanazawa, Angjoo and Zhang, Jason Y. and Felsen, Panna and Malik, Jitendra",
          title = "Learning 3D Human Dynamics From Video",
          booktitle = "CVPR",
          year = "2019"
      }
      
    Future Motion (FM) link paper
    • Summary: A dataset of instance-level motions in still images containing 11K+ pedestrian instances along with quantized motion directions and auto-generated bounding boxes
    • Applications: Trajectory prediction
    • Data type and annotations: RGB (image), Bounding Box, Activity Label, Motion Direction, Speed
    • Task: Mix
      Used in papers
        Kim et al., "Instance-Level Future Motion Estimation In A Single Image Based On Ordinal Regression", ICCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Kim_2019_ICCV,
              author = "Kim, Kyung-Rae and Choi, Whan and Koh, Yeong Jun and Jeong, Seong-Gyun and Kim, Chang-Su",
              title = "Instance-Level Future Motion Estimation In A Single Image Based On Ordinal Regression",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Kim_2019_ICCV,
          author = "Kim, Kyung-Rae and Choi, Whan and Koh, Yeong Jun and Jeong, Seong-Gyun and Kim, Chang-Su",
          title = "Instance-Level Future Motion Estimation In A Single Image Based On Ordinal Regression",
          booktitle = "ICCV",
          year = "2019"
      }
      
    EgoPose link paper arxiv
    • Summary: A dataset of walking human video clips from the front and egocentric views with the corresponding 3D poses
    • Applications: Motion prediction
    • Data type and annotations: RGB, 3D Pose
    • Task: Pose (egocentric)
      Used in papers
        Yuan et al., "Ego-Pose Estimation And Forecasting As Real-Time PD Control", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Yuan_2019_ICCV,
              author = "Yuan, Ye and Kitani, Kris",
              title = "Ego-Pose Estimation And Forecasting As Real-Time PD Control",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Yuan_2019_ICCV,
          author = "Yuan, Ye and Kitani, Kris",
          title = "Ego-Pose Estimation And Forecasting As Real-Time PD Control",
          booktitle = "ICCV",
          year = "2019"
      }
      
    PEMS-SF link
    • Summary: A dataset with over 15 months of lane occupancy rate (0 to 1) information for select freeways in California
    • Applications: Trajectory prediction
    • Data type and annotations: Lane Occupancy Rate
    • Task: Driving
      Used in papers
        Qi et al., "Imitative Non-Autoregressive Modeling for Trajectory Forecasting and Imputation", CVPR, 2020. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Qi_2020_CVPR,
              author = "Qi, Mengshi and Qin, Jie and Wu, Yu and Yang, Yi",
              title = "Imitative Non-Autoregressive Modeling for Trajectory Forecasting and Imputation",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @Misc{Dua_2019,
          author = "Dua, Dheeru and Graff, Casey",
          year = "2017",
          title = "UCI Machine Learning Repository",
          url = "http://archive.ics.uci.edu/ml",
          institution = "University of California, Irvine, School of Information and Computer Sciences"
      }
      

2018

    Epic-Kitchens link paper arxiv
    • Summary: An egocentric cooking action dataset with 55 hours of recording at 60fps with corresponding audio recording and 40K action segments
    • Applications: Action prediction
    • Data type and annotations: RGB, Audio, Bounding Box, Object Class, Text, Temporal Segment
    • Task: Cooking (egocentric)
      Used in papers
        Ke et al., "Time-Conditioned Action Anticipation In One Shot", CVPR, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ke_2019_CVPR,
              author = "Ke, Qiuhong and Fritz, Mario and Schiele, Bernt",
              title = "Time-Conditioned Action Anticipation In One Shot",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Furnari et al., "What Would You Expect? Anticipating Egocentric Actions With Rolling-Unrolling LSTMs And Modality Attention", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Furnari_2019_ICCV,
              author = "Furnari, Antonino and Farinella, Giovanni Maria",
              title = "What Would You Expect? Anticipating Egocentric Actions With Rolling-Unrolling LSTMs And Modality Attention",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Furnari et al., "Egocentric Action Anticipation By Disentangling Encoding And Inference", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Furnari_2019_ICIP,
              author = "Furnari, A. and Farinella, G. M.",
              booktitle = "ICIP",
              title = "Egocentric Action Anticipation By Disentangling Encoding And Inference",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Damen_2018_ECCV,
          author = "Damen, Dima and Doughty, Hazel and Farinella, Giovanni Maria and Fidler, Sanja and Furnari, Antonino and Kazakos, Evangelos and Moltisanti, Davide and Munro, Jonathan and Perrett, Toby and Price, Will and Wray, Michael",
          title = "Scaling Egocentric Vision: The Epic-Kitchens Dataset",
          booktitle = "ECCV",
          year = "2018"
      }
      
    VIRAT/ActEV link paper
    • Summary: A dataset of multi-view surveillance sequences for trajectory prediction and detection of 38 common outdoor activities
    • Applications: Action prediction, Trajectory prediction
    • Data type and annotations: RGB, Bounding Box, Activity Label, Temporal Segment
    • Task: Surveillance
      Used in papers
        Liang et al., "Peeking Into The Future: Predicting Future Person Activities And Locations In Videos", CVPR, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2019_CVPR,
              author = "Liang, Junwei and Jiang, Lu and Niebles, Juan Carlos and Hauptmann, Alexander G. and Fei-Fei, Li",
              title = "Peeking Into The Future: Predicting Future Person Activities And Locations In Videos",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Liang et al., "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2020_CVPR_2,
              author = "Liang, Junwei and Jiang, Lu and Murphy, Kevin and Yu, Ting and Hauptmann, Alexander",
              title = "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Awad_2018_Trecvid,
          author = "Awad, George and Butt, Asad and Curtis, Keith and Lee, Yooyoung and Fiscus, Jonathan and Godil, Afzal and Joy, David and Delgado, Andrew and Smeaton, Alan and Graham, Yvette and others",
          title = "Benchmarking Video Activity Detection, Video Captioning And Matching, Video Storytelling Linking And Video Search",
          booktitle = "TRECVID",
          year = "2018"
      }
      
    3D POSES IN THE WILD (3DPW) link paper
    • Summary: A dataset of 60 video sequences with 2D poses and 3D body models
    • Applications: Motion prediction
    • Data type and annotations: RGB, 2D/3D pose, models
    • Task: Outdoor
      Used in papers
        Cui et al., "Learning Dynamic Relationships for 3D Human Motion Prediction", CVPR, 2020. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Cui_2020_CVPR,
              author = "Cui, Qiongjie and Sun, Huaijiang and Yang, Fei",
              title = "Learning Dynamic Relationships for 3D Human Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Mao et al., "Learning Trajectory Dependencies For Human Motion Prediction", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Mao_2019_ICCV,
              author = "Mao, Wei and Liu, Miaomiao and Salzmann, Mathieu and Li, Hongdong",
              title = "Learning Trajectory Dependencies For Human Motion Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{vonMarcard_2018_ECCV,
          author = "von Marcard, Timo and Henschel, Roberto and Black, Michael and Rosenhahn, Bodo and Pons-Moll, Gerard",
          title = "Recovering Accurate 3D Human Pose In The Wild Using Imus And A Moving Camera",
          booktitle = "ECCV",
          year = "2018"
      }
      
    Basketball Tracking Dataset (BTD) link paper
    • Summary: A dataset of basketball players’ trajectories for 2015-16 NBA games
    • Applications: Trajectory prediction
    • Data type and annotations: Trajectory
    • Task: Sport
      Used in papers
        Qi et al., "Imitative Non-Autoregressive Modeling for Trajectory Forecasting and Imputation", CVPR, 2020. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Qi_2020_CVPR,
              author = "Qi, Mengshi and Qin, Jie and Wu, Yu and Yang, Yi",
              title = "Imitative Non-Autoregressive Modeling for Trajectory Forecasting and Imputation",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Felsen et al., "Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion Using Conditional Variational Autoencoders", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Felsen_2018_ECCV,
              author = "Felsen, Panna and Lucey, Patrick and Ganguly, Sujoy",
              title = "Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion Using Conditional Variational Autoencoders",
              booktitle = "ECCV",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Felsen_2018_ECCV,
          author = "Felsen, Panna and Lucey, Patrick and Ganguly, Sujoy",
          title = "Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion Using Conditional Variational Autoencoders",
          booktitle = "ECCV",
          year = "2018"
      }
      
    YouCook2 link paper arxiv
    • Summary: A dataset consisting of 2K videos of cooking 89 recipes with corresponding English descriptions
    • Applications: Action prediction
    • Data type and annotations: RGB, audio, text, activity label, temporal segment
    • Task: Cooking
      Used in papers
        Sener et al., "Zero-Shot Anticipation For Instructional Activities", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sener_2019_ICCV,
              author = "Sener, Fadime and Yao, Angela",
              title = "Zero-Shot Anticipation For Instructional Activities",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Zhou_2018_AI,
          author = "Zhou, Luowei and Xu, Chenliang and Corso, Jason J",
          title = "Towards Automatic Learning Of Procedures From Web Instructional Videos",
          booktitle = "AAAI",
          year = "2018"
      }
      
    VIENA link paper arxiv
    • Summary: A virtual driving dataset for action anticipation containing 5 different driving scenarios and 25 action classes
    • Applications: Action prediction
    • Data type and annotations: RGB, activity label, vehicle sensors
    • Task: Driving (simulation)
      Used in papers
        Aliakbarian et al., "Viena: A Driving Anticipation Dataset", ACCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Aliakbarian_2018_ACCV,
              author = "Aliakbarian, Mohammad Sadegh and Saleh, Fatemeh Sadat and Salzmann, Mathieu and Fernando, Basura and Petersson, Lars and Andersson, Lars",
              editor = "Jawahar, C. V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Viena: A Driving Anticipation Dataset",
              booktitle = "ACCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Aliakbarian_2018_ACCV,
          author = "Aliakbarian, Mohammad Sadegh and Saleh, Fatemeh Sadat and Salzmann, Mathieu and Fernando, Basura and Petersson, Lars and Andersson, Lars",
          editor = "Jawahar, C. V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
          title = "Viena: A Driving Anticipation Dataset",
          booktitle = "ACCV",
          year = "2019"
      }
      
    ShapeStack link paper arxiv
    • Summary: A simulated dataset of 20K stack configurations composed of a variety of elementary geometric primitives annotated with semantics and structural stability
    • Applications: Video prediction
    • Data type and annotations: RGBD, mask, stability
    • Task: Object (simulation)
      Used in papers
        Ye et al., "Compositional Video Prediction", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ye_2019_ICCV,
              author = "Ye, Yufei and Singh, Maneesh and Gupta, Abhinav and Tulsiani, Shubham",
              title = "Compositional Video Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Groth_2018_arxiv,
          author = "Groth, Oliver and Fuchs, Fabian B and Posner, Ingmar and Vedaldi, Andrea",
          title = "Shapestacks: Learning Vision-Based Physical Intuition For Generalised Object Stacking",
          booktitle = "ECCV",
          year = "2018"
      }
      
    ShanghaiTech Campus (STC) link paper arxiv
    • Summary: An anomaly detection dataset with 300K+ frames of surveillance footage containing 130 abnormal events in 13 scenes
    • Applications: Video prediction
    • Data type and annotations: RGB, anomaly
    • Task: Surveillance, Anomaly
      Used in papers
      Bibtex
      @InProceedings{Liu_2018_CVPR,
          author = "Liu, Wen and Luo, Weixin and Lian, Dongze and Gao, Shenghua",
          title = "Future Frame Prediction For Anomaly Detection – A New Baseline",
          booktitle = "CVPR",
          year = "2018"
      }
      
    Extended Georgia Tech Egocentric Activity Gaze+ (EGTEA Gaze+) link paper arxiv
    • Summary: An egocentric cooking action dataset with 28 hours of recordings from 86 unique sessions of 32 subjects at a frame rate of 30hz
    • Applications: Action prediction
    • Data type and annotations: RGB, gaze, mask, activity label, temporal segment
    • Task: Cooking (egocentric)
      Used in papers
        Furnari et al., "What Would You Expect? Anticipating Egocentric Actions With Rolling-Unrolling Lstms And Modality Attention", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Furnari_2019_ICCV,
              author = "Furnari, Antonino and Farinella, Giovanni Maria",
              title = "What Would You Expect? Anticipating Egocentric Actions With Rolling-Unrolling Lstms And Modality Attention",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Li_2018_ECCV_2,
          author = "Li, Yin and Liu, Miao and Rehg, James M",
          title = "In The Eye Of Beholder: Joint Learning Of Gaze And Actions In First Person Video",
          booktitle = "ECCV",
          year = "2018"
      }
      
    Atomic Visual Actions (AVA) link paper arxiv
    • Summary: An action dataset of 80 atomic visual actions in 430 videos with 1.62M corresponding labels localized in space and time
    • Applications: Action prediction
    • Data type and annotations: RGB, activity label, temporal segment
    • Task: Activity
      Used in papers
        Sun et al., "Relational Action Forecasting", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sun_2019_CVPR,
              author = "Sun, Chen and Shrivastava, Abhinav and Vondrick, Carl and Sukthankar, Rahul and Murphy, Kevin and Schmid, Cordelia",
              title = "Relational Action Forecasting",
              booktitle = "CVPR",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Gu_2018_CVPR,
          author = "Gu, Chunhui and Sun, Chen and Ross, David A and Vondrick, Carl and Pantofaru, Caroline and Li, Yeqing and Vijayanarasimhan, Sudheendra and Toderici, George and Ricco, Susanna and Sukthankar, Rahul and others",
          title = "Ava: A Video Dataset Of Spatio-Temporally Localized Atomic Visual Actions",
          booktitle = "CVPR",
          year = "2018"
      }
      
    ACTICIPATE link paper arxiv
    • Summary: A collection of datasets for human-robot interaction involving human-human and human-robot object handovers
    • Applications: Action prediction
    • Data type and annotations: RGB, gaze, pose
    • Task: Interaction
      Used in papers
        Schydlo et al., "Anticipation In Human-Robot Cooperation: A Recurrent Neural Network Approach For Multiple Action Sequences Prediction", ICRA, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Schydlo_2018_ICRA_2,
              author = "Schydlo, P. and Rakovic, M. and Jamone, L. and Santos-Victor, J.",
              booktitle = "ICRA",
              title = "Anticipation In Human-Robot Cooperation: A Recurrent Neural Network Approach For Multiple Action Sequences Prediction",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Schydlo_2018_ICRA,
          author = "Schydlo, Paul and Rakovic, Mirko and Jamone, Lorenzo and Santos-Victor, Jos{\'e}",
          title = "Anticipation In Human-Robot Cooperation: A Recurrent Neural Network Approach For Multiple Action Sequences Prediction",
          booktitle = "ICRA",
          year = "2018"
      }
      
    Sea Surface Temperature (SST) link paper arxiv
    • Summary: A dataset containing monthly reports of ocean forecast information, including temperature, salinity, currents, sea level, etc.
    • Applications: Video prediction
    • Data type and annotations: Map, Temperature
    • Task: Simulation
      Used in papers
        Le Guen et al., "Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Guen_2020_CVPR,
              author = "Le Guen, Vincent and Thome, Nicolas",
              title = "Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Bezenac_2018_ICLR,
          author = "de Bezenac, Emmanuel and Pajot, Arthur and Gallinari, Patrick",
          title = "Deep Learning for Physical Processes: Incorporating Prior Scientific Knowledge",
          booktitle = "ICLR",
          year = "2018"
      }
      
    ApolloScape link paper arxiv
    • Summary: A driving dataset of 5K vehicle instances, 110K lane segments and 100 mins of sequences for trajectory prediction and tracking annotated at 2hz
    • Applications: Trajectory prediction
    • Data type and annotations: Stereo RGB, LIDAR, 3D Bounding Box, Object Class, Semantic Segment, Tracking ID
    • Task: Driving
      Used in papers
        Fang et al., "TPNet: Trajectory Proposal Network for Motion Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Fang_2020_CVPR,
              author = "Fang, Liangji and Jiang, Qinhong and Shi, Jianping and Zhou, Bolei",
              title = "TPNet: Trajectory Proposal Network for Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @Article{Wang_2019_PAMI,
          author = "Wang, Peng and Huang, Xinyu and Cheng, Xinjing and Zhou, Dingfu and Geng, Qichuan and Yang, Ruigang",
          title = "The Apolloscape Open Dataset for Autonomous Driving and its Application",
          journal = "PAMI",
          year = "2019"
      }
      

2017

    Joint Attention in Autonomous Driving (JAAD) link paper
    • Summary: A dataset of pedestrians with 346 video sequences showing pedestrians at the time of crossing in different geographical locations and under different weather conditions
    • Applications: Video prediction, Action prediction, Trajectory prediction, Other prediction
    • Data type and annotations: RGB, bounding box, attribute, temporal segment, Tracking ID
    • Task: Driving
      Used in papers
        Chaabane et al., "Looking Ahead: Anticipating Pedestrians Crossing with Future Frames Prediction", WACV, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Chaabane_2020_WACV,
              author = "Chaabane, Mohamed and Trabelsi, Ameni and Blanchard, Nathaniel and Beveridge, Ross",
              title = "Looking Ahead: Anticipating Pedestrians Crossing with Future Frames Prediction",
              booktitle = "WACV",
              year = "2020"
          }
          
        Gujjar et al., "Classifying Pedestrian Actions In Advance Using Predicted Video Of Urban Driving Scenes", ICRA, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Gujjar_2019_ICRA,
              author = "Gujjar, P. and Vaughan, R.",
              booktitle = "ICRA",
              title = "Classifying Pedestrian Actions In Advance Using Predicted Video Of Urban Driving Scenes",
              year = "2019"
          }
          
        Saleh et al., "Real-Time Intent Prediction Of Pedestrians For Autonomous Ground Vehicles Via Spatio-Temporal Densenet", ICRA, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Saleh_2019_ICRA,
              author = "Saleh, K. and Hossny, M. and Nahavandi, S.",
              booktitle = "ICRA",
              title = "Real-Time Intent Prediction Of Pedestrians For Autonomous Ground Vehicles Via Spatio-Temporal Densenet",
              year = "2019"
          }
          
        Aliakbarian et al., "Viena: A Driving Anticipation Dataset", ACCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Aliakbarian_2018_ACCV,
              author = "Aliakbarian, Mohammad Sadegh and Saleh, Fatemeh Sadat and Salzmann, Mathieu and Fernando, Basura and Petersson, Lars and Andersson, Lars",
              editor = "Jawahar, C. V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Viena: A Driving Anticipation Dataset",
              booktitle = "ACCV",
              year = "2019"
          }
          
        Rasouli et al., "Are They Going To Cross? A Benchmark Dataset And Baseline For Pedestrian Crosswalk Behavior", ICCVW, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Rasouli_2017_ICCVW,
              author = "Rasouli, Amir and Kotseruba, Iuliia and Tsotsos, John K.",
              title = "Are They Going To Cross? A Benchmark Dataset And Baseline For Pedestrian Crosswalk Behavior",
              booktitle = "ICCVW",
              year = "2017"
          }
          
        Mangalam et al., "Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision", WACV, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Mangalam_2020_WACV,
              author = "Mangalam, Karttikeya and Adeli, Ehsan and Lee, Kuan-Hui and Gaidon, Adrien and Niebles, Juan Carlos",
              title = "Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision",
              booktitle = "WACV",
              year = "2020"
          }
          
        Rasouli et al., "Pie: A Large-Scale Dataset And Models For Pedestrian Intention Estimation And Trajectory Prediction", ICCV, 2019. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Rasouli_2019_ICCV,
              author = "Rasouli, Amir and Kotseruba, Iuliia and Kunic, Toni and Tsotsos, John K.",
              title = "Pie: A Large-Scale Dataset And Models For Pedestrian Intention Estimation And Trajectory Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Afolabi et al., "People As Sensors: Imputing Maps From Human Actions", IROS, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Afolabi_2018_IROS,
              author = "Afolabi, O. and Driggs-Campbell, K. and Dong, R. and Kochenderfer, M. J. and Sastry, S. S.",
              booktitle = "IROS",
              title = "People As Sensors: Imputing Maps From Human Actions",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Rasouli_2017_ICCVW,
          author = "Rasouli, Amir and Kotseruba, Iuliia and Tsotsos, John K.",
          title = "Are They Going To Cross? A Benchmark Dataset And Baseline For Pedestrian Crosswalk Behavior",
          booktitle = "ICCVW",
          year = "2017"
      }
      
    Oxford Robot Car (ORC) link paper
    • Summary: A dataset containing 100 repetitions of a consistent route through Oxford driven by a vehicle under different weather and traffic conditions
    • Applications: Trajectory prediction
    • Data type and annotations: Stereo RGB, LIDAR, vehicle sensors
    • Task: Driving
      Used in papers
        Marchetti et al., "MANTRA: Memory Augmented Networks for Multiple Trajectory Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Marchetti_2020_CVPR,
              author = "Marchetti, Francesco and Becattini, Federico and Seidenari, Lorenzo and Del Bimbo, Alberto",
              title = "MANTRA: Memory Augmented Networks for Multiple Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Srikanth et al., "Infer: Intermediate Representations For Future Prediction", IROS, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Srikanth_2019_IROS,
              author = "Srikanth, Shashank and Ansari, Junaid Ahmed and Sharma, Sarthak and others",
              booktitle = "IROS",
              title = "Infer: Intermediate Representations For Future Prediction",
              year = "2019"
          }
          
      Bibtex
      @Article{Maddern_2017_IJRR,
          author = "Maddern, Will and Pascoe, Geoff and Linegar, Chris and Newman, Paul",
          title = "1 Year, 1000Km: The Oxford Robotcar Dataset",
          journal = "IJRR",
          volume = "36",
          number = "1",
          pages = "3--15",
          year = "2017"
      }
      
    STRANDS link paper arxiv
    • Summary: An RGBD dataset of objects with corresponding 3D bounding boxes collected using a mobile robot
    • Applications: Trajectory prediction
    • Data type and annotations: RGBD, 3D bounding box
    • Task: Driving
      Used in papers
        Sun et al., "3Dof Pedestrian Trajectory Prediction Learned From Long-Term Autonomous Mobile Robot Deployment Data", ICRA, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sun_2018_ICRA,
              author = "Sun, L. and Yan, Z. and Mellado, S. M. and Hanheide, M. and Duckett, T.",
              booktitle = "ICRA",
              title = "3Dof Pedestrian Trajectory Prediction Learned From Long-Term Autonomous Mobile Robot Deployment Data",
              year = "2018"
          }
          
      Bibtex
      @Article{Hawes_2017_RAM,
          author = "Hawes, Nick and Burbridge, Christopher and Jovan, Ferdian and Kunze, Lars and Lacerda, Bruno and Mudrova, Lenka and Young, Jay and Wyatt, Jeremy and Hebesberger, Denise and Kortner, Tobias and others",
          title = "The Strands Project: Long-Term Autonomy In Everyday Environments",
          journal = "IEEE Robotics \& Automation Magazine",
          volume = "24",
          number = "3",
          pages = "146--156",
          year = "2017"
      }
      
    Recipe1M link paper arxiv
    • Summary: A dataset of 1M+ cooking recipes with 13M food images
    • Applications: Action prediction
    • Data type and annotations: RGB (image), text
    • Task: Cooking
      Used in papers
        Sener et al., "Zero-Shot Anticipation For Instructional Activities", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sener_2019_ICCV,
              author = "Sener, Fadime and Yao, Angela",
              title = "Zero-Shot Anticipation For Instructional Activities",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Salvador_2017_CVPR,
          author = "Salvador, Amaia and Hynes, Nicholas and Aytar, Yusuf and Marin, Javier and Ofli, Ferda and Weber, Ingmar and Torralba, Antonio",
          title = "Learning Cross-Modal Embeddings For Cooking Recipes And Food Images",
          booktitle = "CVPR",
          year = "2017"
      }
      
    PKU-MMD link arxiv
    • Summary: A multimodal dataset of 41 daily activities and 10 interaction actions recorded from 3 camera views using 60 subjects
    • Applications: Action prediction
    • Data type and annotations: RGBD, IR, 3D pose, multiview, temporal segment
    • Task: Activity, Interaction
      Used in papers
        Liu et al., "Ssnet: Scale Selection Network For Online 3D Action Prediction", CVPR, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Liu_2018_CVPR_ssnet,
              author = "Liu, Jun and Shahroudy, Amir and Wang, Gang and Duan, Ling-Yu and Kot, Alex C.",
              title = "Ssnet: Scale Selection Network For Online 3D Action Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
      Bibtex
      @Article{Liu_2017_arxiv,
          author = "Liu, Chunhui and Hu, Yueyu and Li, Yanghao and Song, Sijie and Liu, Jiaying",
          title = "Pku-Mmd: A Large Scale Benchmark For Continuous Multi-Modal Human Action Understanding",
          journal = "arXiv:1703.07475",
          year = "2017"
      }
      
    Mouse Fish link paper arxiv
    • Summary: A dataset of fishes and mice in a lab environment with corresponding 2D poses
    • Applications: Motion prediction
    • Data type and annotations: Depth, 3D pose
    • Task: Animal
      Used in papers
        Liu et al., "Towards Natural And Accurate Future Motion Prediction Of Humans And Animals", CVPR, 2019. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Liu_2019_CVPR,
              author = "Liu, Zhenguang and Wu, Shuang and Jin, Shuyuan and Liu, Qi and Lu, Shijian and Zimmermann, Roger and Cheng, Li",
              title = "Towards Natural And Accurate Future Motion Prediction Of Humans And Animals",
              booktitle = "CVPR",
              year = "2019"
          }
          
      Bibtex
      @Article{Xu_2017_IJCV,
          author = "Xu, Chi and Govindarajan, Lakshmi Narasimhan and Zhang, Yu and Cheng, Li",
          title = "Lie-X: Depth Image Based Articulated Object Pose Estimation, Tracking, And Action Recognition On Lie Groups",
          journal = "IJCV",
          volume = "123",
          number = "3",
          pages = "454--478",
          year = "2017"
      }
      
    L-CAS link paper
    • Summary: A dataset of 28K indoor LIDAR scans showing the surroundings of a mobile robot in stationary or moving states
    • Applications: Trajectory prediction
    • Data type and annotations: LIDAR, 3D bounding box, attribute
    • Task: Driving
      Used in papers
        Sun et al., "3Dof Pedestrian Trajectory Prediction Learned From Long-Term Autonomous Mobile Robot Deployment Data", ICRA, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sun_2018_ICRA,
              author = "Sun, L. and Yan, Z. and Mellado, S. M. and Hanheide, M. and Duckett, T.",
              booktitle = "ICRA",
              title = "3Dof Pedestrian Trajectory Prediction Learned From Long-Term Autonomous Mobile Robot Deployment Data",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Yan_2017_IROS,
          author = "Yan, Zhi and Duckett, Tom and Bellotto, Nicola",
          title = "Online Learning For Human Classification In 3D Lidar-Based Tracking",
          booktitle = "IROS",
          year = "2017"
      }
      
    Epic-Fail link paper arxiv
    • Summary: A risk-assessment dataset of 3K failed-activity videos annotated every 15 frames with bounding boxes around risky regions
    • Applications: Action prediction
    • Data type and annotations: RGB, bounding box, trajectory, temporal segment
    • Task: Risk assessment
      Used in papers
        Zeng et al., "Agent-Centric Risk Assessment: Accident Anticipation And Risky Region Localization", CVPR, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zeng_2017_CVPR,
              author = "Zeng, Kuo-Hao and Chou, Shih-Han and Chan, Fu-Hsiang and Niebles, Juan Carlos and Sun, Min",
              title = "Agent-Centric Risk Assessment: Accident Anticipation And Risky Region Localization",
              booktitle = "CVPR",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Zeng_2017_CVPR,
          author = "Zeng, Kuo-Hao and Chou, Shih-Han and Chan, Fu-Hsiang and Niebles, Juan Carlos and Sun, Min",
          title = "Agent-Centric Risk Assessment: Accident Anticipation And Risky Region Localization",
          booktitle = "CVPR",
          year = "2017"
      }
      
    CityPersons link paper arxiv
    • Summary: A subset of the Cityscapes dataset with fine-grained annotations for pedestrians and vehicles in an additional 20K images, with a total of 35K+ bounding boxes for pedestrians
    • Applications: Trajectory prediction
    • Data type and annotations: Stereo RGB, bounding box, semantic segment
    • Task: Driving
      Used in papers
        Bhattacharyya et al., "Long-Term On-Board Prediction Of People In Traffic Scenes Under Uncertainty", CVPR, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Bhattacharyya_2018_CVPR,
              author = "Bhattacharyya, Apratim and Fritz, Mario and Schiele, Bernt",
              title = "Long-Term On-Board Prediction Of People In Traffic Scenes Under Uncertainty",
              booktitle = "CVPR",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Shanshan_2017_CVPR,
          author = "Zhang, Shanshan and Benenson, Rodrigo and Schiele, Bernt",
          title = "Citypersons: A Diverse Dataset For Pedestrian Detection",
          booktitle = "CVPR",
          year = "2017"
      }
      
    BU Action (BUA) link paper arxiv
    • Summary: A dataset of action images with 23K+ images and 101 activity classes collected from existing action video datasets
    • Applications: Action prediction
    • Data type and annotations: RGB (image), activity label
    • Task: Activity
      Used in papers
        Safaei et al., "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution", WACV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Safaei_2019_WACV,
              author = "Safaei, M. and Foroosh, H.",
              booktitle = "WACV",
              title = "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution",
              year = "2019"
          }
          
      Bibtex
      @Article{Ma_2017_PR,
          author = "Ma, Shugao and Bargal, Sarah Adel and Zhang, Jianming and Sigal, Leonid and Sclaroff, Stan",
          title = "Do Less And Achieve More: Training Cnns For Action Recognition Utilizing Action Images From The Web",
          journal = "Pattern Recognition",
          volume = "68",
          pages = "334--345",
          year = "2017"
      }
      
    Mapillary Vistas link paper
    • Summary: A dataset of street-level images with the corresponding instance and semantic segmentation
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, Bounding Box, Semantic Segment
    • Task: Driving
      Used in papers
        Makansi et al., "Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Makansi_2020_CVPR,
              author = "Makansi, Osama and Cicek, Ozgun and Buchicchio, Kevin and Brox, Thomas",
              title = "Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Neuhold_2017_ICCV,
          author = "Neuhold, Gerhard and Ollmann, Tobias and Rota Bulo, Samuel and Kontschieder, Peter",
          title = "The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes",
          booktitle = "ICCV",
          year = "2017"
      }
      

2016

    Cityscapes link paper arxiv
    • Summary: A driving dataset of street images with annotations for 30 traffic objects in 5k frames and weak annotations in 20k frames
    • Applications: Video prediction, Trajectory prediction, Other prediction
    • Data type and annotations: Stereo RGB, bounding box, semantic segment, vehicle sensors
    • Task: Driving
      Used in papers
        Wu et al., "Future Video Synthesis With Object Motion Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Wu_2020_CVPR,
              author = "Wu, Yue and Gao, Rongrong and Park, Jaesik and Chen, Qifeng",
              title = "Future Video Synthesis With Object Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Castrejon et al., "Improved Conditional VRNNs For Video Prediction", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Castrejon_2019_ICCV,
              author = "Castrejon, Lluis and Ballas, Nicolas and Courville, Aaron",
              title = "Improved Conditional VRNNs For Video Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Xu et al., "Structure Preserving Video Prediction", CVPR, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_CVPR,
              author = "Xu, Jingwei and Ni, Bingbing and Li, Zefan and Cheng, Shuo and Yang, Xiaokang",
              title = "Structure Preserving Video Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Saric et al., "Warp to the Future: Joint Forecasting of Features and Feature Motion", CVPR, 2020. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Saric_2020_CVPR,
              author = "Saric, Josip and Orsic, Marin and Antunovic, Tonci and Vrazic, Sacha and Segvic, Sinisa",
              title = "Warp to the Future: Joint Forecasting of Features and Feature Motion",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Srikanth et al., "INFER: Intermediate Representations For Future Prediction", IROS, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Srikanth_2019_IROS,
              author = "Srikanth, Shashank and Ansari, Junaid Ahmed and Sharma, Sarthak and others",
              booktitle = "IROS",
              title = "INFER: Intermediate Representations For Future Prediction",
              year = "2019"
          }
          
        Terwilliger et al., "Recurrent Flow-Guided Semantic Forecasting", WACV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Terwilliger_2019_WACV,
              author = "Terwilliger, A. and Brazil, G. and Liu, X.",
              booktitle = "WACV",
              title = "Recurrent Flow-Guided Semantic Forecasting",
              year = "2019"
          }
          
        Luc et al., "Predicting Future Instance Segmentation By Forecasting Convolutional Features", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Luc_2018_ECCV,
              author = "Luc, Pauline and Couprie, Camille and LeCun, Yann and Verbeek, Jakob",
              title = "Predicting Future Instance Segmentation By Forecasting Convolutional Features",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Luc et al., "Predicting Deeper Into The Future Of Semantic Segmentation", ICCV, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Luc_2017_ICCV,
              author = "Luc, Pauline and Neverova, Natalia and Couprie, Camille and Verbeek, Jakob and LeCun, Yann",
              title = "Predicting Deeper Into The Future Of Semantic Segmentation",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Jin et al., "Predicting Scene Parsing And Motion Dynamics In The Future", NeurIPS, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Jin_2017_NeurIPS,
              author = "Jin, Xiaojie and Xiao, Huaxin and Shen, Xiaohui and Yang, Jimei and Lin, Zhe and Chen, Yunpeng and Jie, Zequn and Feng, Jiashi and Yan, Shuicheng",
              title = "Predicting Scene Parsing And Motion Dynamics In The Future",
              booktitle = "NeurIPS",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Cordts_2016_CVPR,
          author = "Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Rehfeld, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt",
          title = "The Cityscapes Dataset For Semantic Urban Scene Understanding",
          booktitle = "CVPR",
          year = "2016"
      }
      
    Stanford Drone (SD) link paper
    • Summary: A dataset of pedestrian and cyclist movements recorded using an aerial drone with 3K+ tracks
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, bounding box, object class, Tracking ID
    • Task: Surveillance
      Used in papers
        Haddad et al., "Self-Growing Spatial Graph Networks for Pedestrian Trajectory Prediction", WACV, 2020. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Haddad_2020_WACV,
              author = "Haddad, Sirin and Lam, Siew-Kei",
              title = "Self-Growing Spatial Graph Networks for Pedestrian Trajectory Prediction",
              booktitle = "WACV",
              year = "2020"
          }
          
        Li, "Which Way Are You Going? Imitative Decision Learning For Path Forecasting In Dynamic Scenes", CVPR, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2019_CVPR,
              author = "Li, Yuke",
              title = "Which Way Are You Going? Imitative Decision Learning For Path Forecasting In Dynamic Scenes",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Zhao et al., "Multi-Agent Tensor Fusion For Contextual Trajectory Prediction", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zhao_2019_CVPR,
              author = "Zhao, Tianyang and Xu, Yifei and Monfort, Mathew and Choi, Wongun and Baker, Chris and Zhao, Yibiao and Wang, Yizhou and Wu, Ying Nian",
              title = "Multi-Agent Tensor Fusion For Contextual Trajectory Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Choi et al., "Looking To Relations For Future Trajectory Forecast", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Choi_2019_ICCV,
              author = "Choi, Chiho and Dariush, Behzad",
              title = "Looking To Relations For Future Trajectory Forecast",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Li et al., "Conditional Generative Neural System For Probabilistic Trajectory Prediction", IROS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2019_IROS,
              author = "Li, Jiachen and Ma, Hengbo and Tomizuka, Masayoshi",
              booktitle = "IROS",
              title = "Conditional Generative Neural System For Probabilistic Trajectory Prediction",
              year = "2019"
          }
          
        Xue et al., "Location-Velocity Attention For Pedestrian Trajectory Prediction", WACV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xue_2019_WACV,
              author = "Xue, H. and Huynh, D. and Reynolds, M.",
              booktitle = "WACV",
              title = "Location-Velocity Attention For Pedestrian Trajectory Prediction",
              year = "2019"
          }
          
        Lee et al., "DESIRE: Distant Future Prediction In Dynamic Scenes With Interacting Agents", CVPR, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Lee_2017_CVPR,
              author = "Lee, Namhoon and Choi, Wongun and Vernaza, Paul and Choy, Christopher B. and Torr, Philip H. S. and Chandraker, Manmohan",
              title = "DESIRE: Distant Future Prediction In Dynamic Scenes With Interacting Agents",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Ballan et al., "Knowledge Transfer For Scene-Specific Motion Prediction", ECCV, 2016. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ballan_2016_ECCV,
              author = "Ballan, Lamberto and Castaldo, Francesco and Alahi, Alexandre and Palmieri, Francesco and Savarese, Silvio",
              editor = "Leibe, Bastian and Matas, Jiri and Sebe, Nicu and Welling, Max",
              title = "Knowledge Transfer For Scene-Specific Motion Prediction",
              booktitle = "ECCV",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Robicquet_2016_ECCV,
          author = "Robicquet, Alexandre and Sadeghian, Amir and Alahi, Alexandre and Savarese, Silvio",
          title = "Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes",
          booktitle = "ECCV",
          year = "2016"
      }
      
    CMU Mocap link
    • Summary: A motion dataset consisting of various activities, including human interaction, interaction with the environment, locomotion, sports, etc.
    • Applications: Action prediction, Motion prediction
    • Data type and annotations: 3D pose, activity label
    • Task: Activity
      Used in papers
        Butepage et al., "Deep Representation Learning For Human Motion Prediction And Classification", CVPR, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Butepage_2017_CVPR,
              author = "Butepage, Judith and Black, Michael J. and Kragic, Danica and Kjellstrom, Hedvig",
              title = "Deep Representation Learning For Human Motion Prediction And Classification",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Aliakbarian et al., "A Stochastic Conditioning Scheme for Diverse Human Motion Prediction", CVPR, 2020. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Aliakbarian_2020_CVPR,
              author = "Aliakbarian, Sadegh and Saleh, Fatemeh Sadat and Salzmann, Mathieu and Petersson, Lars and Gould, Stephen",
              title = "A Stochastic Conditioning Scheme for Diverse Human Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Cui et al., "Learning Dynamic Relationships for 3D Human Motion Prediction", CVPR, 2020. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Cui_2020_CVPR,
              author = "Cui, Qiongjie and Sun, Huaijiang and Yang, Fei",
              title = "Learning Dynamic Relationships for 3D Human Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Li et al., "Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2020_CVPR,
              author = "Li, Maosen and Chen, Siheng and Zhao, Yangheng and Zhang, Ya and Wang, Yanfeng and Tian, Qi",
              title = "Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Mao et al., "Learning Trajectory Dependencies For Human Motion Prediction", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Mao_2019_ICCV,
              author = "Mao, Wei and Liu, Miaomiao and Salzmann, Mathieu and Li, Hongdong",
              title = "Learning Trajectory Dependencies For Human Motion Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
      Bibtex
      @Misc{CMU_Mocap_2016,
          author = "CMU",
          title = "CMU Graphics Lab Motion Capture Database",
          howpublished = "http://mocap.cs.cmu.edu/",
          year = "2016"
      }
      
    BAIR Push link paper arxiv
    • Summary: A dataset of object manipulation using a robot arm with 59k object pushing motion samples
    • Applications: Video prediction
    • Data type and annotations: RGB
    • Task: Robot object manipulation
      Used in papers
        Jin et al., "Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Jin_2020_CVPR,
              author = "Jin, Beibei and Hu, Yu and Tang, Qiankun and Niu, Jingyu and Shi, Zhiping and Han, Yinhe and Li, Xiaowei",
              title = "Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Castrejon et al., "Improved Conditional VRNNs For Video Prediction", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Castrejon_2019_ICCV,
              author = "Castrejon, Lluis and Ballas, Nicolas and Courville, Aaron",
              title = "Improved Conditional VRNNs For Video Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Xu et al., "Video Prediction Via Selective Sampling", NeurIPS, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_NeurIPS,
              author = "Xu, Jingwei and Ni, Bingbing and Yang, Xiaokang",
              title = "Video Prediction Via Selective Sampling",
              booktitle = "NeurIPS",
              year = "2018"
          }
          
        Finn et al., "Unsupervised Learning For Physical Interaction Through Video Prediction", NeurIPS, 2016. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Finn_2016_NeurIPS,
              author = "Finn, Chelsea and Goodfellow, Ian and Levine, Sergey",
              title = "Unsupervised Learning For Physical Interaction Through Video Prediction",
              booktitle = "NeurIPS",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Finn_2016_NeurIPS,
          author = "Finn, Chelsea and Goodfellow, Ian and Levine, Sergey",
          title = "Unsupervised Learning For Physical Interaction Through Video Prediction",
          booktitle = "NeurIPS",
          year = "2016"
      }
      
    Dashcam Accident Dataset (DAD) link paper
    • Summary: A dataset of 620 video sequences of traffic accidents recorded in six cities
    • Applications: Action prediction
    • Data type and annotations: RGB, bounding box, object class, temporal segment, Tracking ID
    • Task: Driving
      Used in papers
        Suzuki et al., "Anticipating Traffic Accidents With Adaptive Loss And Large-Scale Incident DB", CVPR, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Suzuki_2018_CVPR,
              author = "Suzuki, Tomoyuki and Kataoka, Hirokatsu and Aoki, Yoshimitsu and Satoh, Yutaka",
              title = "Anticipating Traffic Accidents With Adaptive Loss And Large-Scale Incident DB",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Zeng et al., "Agent-Centric Risk Assessment: Accident Anticipation And Risky Region Localization", CVPR, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zeng_2017_CVPR,
              author = "Zeng, Kuo-Hao and Chou, Shih-Han and Chan, Fu-Hsiang and Niebles, Juan Carlos and Sun, Min",
              title = "Agent-Centric Risk Assessment: Accident Anticipation And Risky Region Localization",
              booktitle = "CVPR",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Chan_2016_ACCV,
          author = "Chan, Fu-Hsiang and Chen, Yu-Ting and Xiang, Yu and Sun, Min",
          editor = "Lai, Shang-Hong and Lepetit, Vincent and Nishino, Ko and Sato, Yoichi",
          title = "Anticipating Accidents In Dashcam Videos",
          booktitle = "ACCV",
          year = "2017"
      }
      
    YouTube-8M link arxiv
    • Summary: A large-scale dataset of videos collected from YouTube with corresponding machine-generated annotations from a vocabulary of 3.8K+ visual entities
    • Applications: Video prediction
    • Data type and annotations: RGB, activity label, temporal segment
    • Task: Activity
      Used in papers
        Reda et al., "SDC-Net: Video Prediction Using Spatially-Displaced Convolution", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Reda_2018_ECCV,
              author = "Reda, Fitsum A. and Liu, Guilin and Shih, Kevin J. and Kirby, Robert and Barker, Jon and Tarjan, David and Tao, Andrew and Catanzaro, Bryan",
              title = "SDC-Net: Video Prediction Using Spatially-Displaced Convolution",
              booktitle = "ECCV",
              year = "2018"
          }
          
      Bibtex
      @Article{Abu_2016_arxiv,
          author = "Abu-El-Haija, Sami and Kothari, Nisarg and Lee, Joonseok and Natsev, Paul and Toderici, George and Varadarajan, Balakrishnan and Vijayanarasimhan, Sudheendra",
          title = "YouTube-8M: A Large-Scale Video Classification Benchmark",
          journal = "arXiv:1609.08675",
          year = "2016"
      }
      
    Visual Storytelling (VIST) link paper
    • Summary: A dataset of 80K+ images collected from 21K+ sequences with corresponding text captions
    • Applications: Other prediction
    • Data type and annotations: RGB, text
    • Task: Visual story
      Used in papers
        Zeng et al., "Visual Forecasting By Imitating Dynamics In Natural Sequences", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zeng_2017_ICCV,
              author = "Zeng, Kuo-Hao and Shen, William B. and Huang, De-An and Sun, Min and Niebles, Juan Carlos",
              title = "Visual Forecasting By Imitating Dynamics In Natural Sequences",
              booktitle = "ICCV",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Huang_2016_NAACL,
          author = "Huang, Ting-Hao K. and Ferraro, Francis and Mostafazadeh, Nasrin and Misra, Ishan and Devlin, Jacob and Agrawal, Aishwarya and Girshick, Ross and He, Xiaodong and Kohli, Pushmeet and Batra, Dhruv and others",
          title = "Visual Storytelling",
          booktitle = "NAACL",
          year = "2016"
      }
      
    TV Series link paper arxiv
    • Summary: A collection of sequences collected from TV series for the purpose of action detection
    • Applications: Action prediction
    • Data type and annotations: RGB, activity label, temporal segment
    • Task: Activity
      Used in papers
        Gao et al., "RED: Reinforced Encoder-Decoder Networks For Action Anticipation", BMVC, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Gao_2017_BMVC,
              author = "Gao, Jiyang and Yang, Zhenheng and Nevatia, Ram",
              title = "RED: Reinforced Encoder-Decoder Networks For Action Anticipation",
              year = "2017",
              booktitle = "BMVC"
          }
          
      Bibtex
      @InProceedings{De_2016_ECCV,
          author = "De Geest, Roeland and Gavves, Efstratios and Ghodrati, Amir and Li, Zhenyang and Snoek, Cees and Tuytelaars, Tinne",
          title = "Online Action Detection",
          booktitle = "ECCV",
          year = "2016"
      }
      
    Online Action Detection (OAD) link paper arxiv
    • Summary: A dataset of 10 indoor activities in 59 sequences collected using a Kinect V2 sensor
    • Applications: Action prediction
    • Data type and annotations: RGBD, 3D pose, activity label, temporal segment, Tracking ID
    • Task: Activity
      Used in papers
        Liu et al., "SSNet: Scale Selection Network For Online 3D Action Prediction", CVPR, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Liu_2018_CVPR_ssnet,
              author = "Liu, Jun and Shahroudy, Amir and Wang, Gang and Duan, Ling-Yu and Kot, Alex C.",
              title = "SSNet: Scale Selection Network For Online 3D Action Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Li_2016_ECCV,
          author = "Li, Yanghao and Lan, Cuiling and Xing, Junliang and Zeng, Wenjun and Yuan, Chunfeng and Liu, Jiaying",
          title = "Online Human Action Detection Using Joint Classification-Regression Recurrent Neural Networks",
          booktitle = "ECCV",
          year = "2016"
      }
      
    Ongoing Activity (OA) link paper
    • Summary: A dataset of 450+ activity videos (e.g., cooking, house chores) collected from public video-sharing websites
    • Applications: Action prediction
    • Data type and annotations: RGB, activity label
    • Task: Activity
      Used in papers
        Li et al., "Recognition Of Ongoing Complex Activities By Sequence Prediction Over A Hierarchical Label Space", WACV, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2016_WACV,
              author = "Li, W. and Fritz, M.",
              booktitle = "WACV",
              title = "Recognition Of Ongoing Complex Activities By Sequence Prediction Over A Hierarchical Label Space",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Li_2016_WACV,
          author = "Li, W. and Fritz, M.",
          booktitle = "WACV",
          title = "Recognition Of Ongoing Complex Activities By Sequence Prediction Over A Hierarchical Label Space",
          year = "2016"
      }
      
    NTU RGB-D link paper arxiv
    • Summary: An action dataset of 60 daily activities in 56K+ video samples
    • Applications: Action prediction
    • Data type and annotations: RGBD, IR, 3D pose, activity label
    • Task: Activity
      Used in papers
        Wang et al., "Progressive Teacher-Student Learning For Early Action Prediction", CVPR, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2019_CVPR,
              author = "Wang, Xionghui and Hu, Jian-Fang and Lai, Jian-Huang and Zhang, Jianguo and Zheng, Wei-Shi",
              title = "Progressive Teacher-Student Learning For Early Action Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Shahroudy_2016_CVPR,
          author = "Shahroudy, Amir and Liu, Jun and Ng, Tian-Tsong and Wang, Gang",
          title = "NTU RGB+D: A Large Scale Dataset For 3D Human Activity Analysis",
          booktitle = "CVPR",
          year = "2016"
      }
      
    Miss Universe (MU) link paper arxiv
    • Summary: A dataset of catwalks by Miss Universe contestants during the evening gown competition from 1996 to 2010
    • Applications: Other prediction
    • Data type and annotations: RGB, bounding box, scores
    • Task: Miss universe
      Used in papers
        Carvajal et al., "Towards Miss Universe Automatic Prediction: The Evening Gown Competition", ICPR, 2016. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Carvajal_2016_ICPR,
              author = "Carvajal, J. and Wiliem, A. and Sanderson, C. and Lovell, B.",
              booktitle = "ICPR",
              title = "Towards Miss Universe Automatic Prediction: The Evening Gown Competition",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Carvajal_2016_ICPR,
          author = "Carvajal, J. and Wiliem, A. and Sanderson, C. and Lovell, B.",
          booktitle = "ICPR",
          title = "Towards Miss Universe Automatic Prediction: The Evening Gown Competition",
          year = "2016"
      }
      
    Bouncing Ball (BB) link paper arxiv
    • Summary: A simulated dataset of bouncing balls generated using the Neural Physics Engine
    • Applications: Video prediction
    • Data type and annotations: RGB
    • Task: Object (simulation)
      Used in papers
        Hsieh et al., "Learning To Decompose And Disentangle Representations For Video Prediction", NeurIPS, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Hsieh_2018_NeurIPS,
              author = "Hsieh, Jun-Ting and Liu, Bingbin and Huang, De-An and Fei-Fei, Li F and Niebles, Juan Carlos",
              title = "Learning To Decompose And Disentangle Representations For Video Prediction",
              booktitle = "NeurIPS",
              year = "2018"
          }
          
      Bibtex
      @Article{Chang_2016_arxiv,
          author = "Chang, Michael B and Ullman, Tomer and Torralba, Antonio and Tenenbaum, Joshua B",
          title = "A Compositional Object-Based Approach To Learning Physical Dynamics",
          journal = "arXiv:1612.00341",
          year = "2016"
      }
      

2015

↑ top
    Moving MNIST (MMNIST) link paper arxiv
    • Summary: A dataset of moving digits on a simple uniform background
    • Applications: Video prediction
    • Data type and annotations: Grayscale
    • Task: Digit
      Used in papers
        Le Guen et al., "Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Guen_2020_CVPR,
              author = "Le Guen, Vincent and Thome, Nicolas",
              title = "Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Wang et al., "Probabilistic Video Prediction From Noisy Data With a Posterior Confidence", CVPR, 2020. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2020_CVPR,
              author = "Wang, Yunbo and Wu, Jiajun and Long, Mingsheng and Tenenbaum, Joshua B.",
              title = "Probabilistic Video Prediction From Noisy Data With a Posterior Confidence",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Castrejon et al., "Improved Conditional VRNNs For Video Prediction", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Castrejon_2019_ICCV,
              author = "Castrejon, Lluis and Ballas, Nicolas and Courville, Aaron",
              title = "Improved Conditional VRNNs For Video Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Lee et al., "Mutual Suppression Network For Video Prediction Using Disentangled Features", BMVC, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Lee_2019_BMVC,
              author = "Lee, Jungbeom and Lee, Jangho and Lee, Sungmin and Yoon, Sungroh",
              title = "Mutual Suppression Network For Video Prediction Using Disentangled Features",
              year = "2019",
              booktitle = "BMVC"
          }
          
        Wang et al., "Order Matters: Shuffling Sequence Generation For Video Prediction", BMVC, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2019_BMVC,
              author = "Wang, Junyan and Hu, Bingzhang and Long, Yang and Guan, Yu",
              title = "Order Matters: Shuffling Sequence Generation For Video Prediction",
              year = "2019",
              booktitle = "BMVC"
          }
          
        Oliu et al., "Folded Recurrent Neural Networks For Future Video Prediction", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Oliu_2018_ECCV,
              author = "Oliu, Marc and Selva, Javier and Escalera, Sergio",
              title = "Folded Recurrent Neural Networks For Future Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Hsieh et al., "Learning To Decompose And Disentangle Representations For Video Prediction", NeurIPS, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Hsieh_2018_NeurIPS,
              author = "Hsieh, Jun-Ting and Liu, Bingbin and Huang, De-An and Fei-Fei, Li F and Niebles, Juan Carlos",
              title = "Learning To Decompose And Disentangle Representations For Video Prediction",
              booktitle = "NeurIPS",
              year = "2018"
          }
          
        Xu et al., "Video Prediction Via Selective Sampling", NeurIPS, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_NeurIPS,
              author = "Xu, Jingwei and Ni, Bingbing and Yang, Xiaokang",
              title = "Video Prediction Via Selective Sampling",
              booktitle = "NeurIPS",
              year = "2018"
          }
          
        Lu et al., "Flexible Spatio-Temporal Networks For Video Prediction", CVPR, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Lu_2017_CVPR,
              author = "Lu, Chaochao and Hirsch, Michael and Scholkopf, Bernhard",
              title = "Flexible Spatio-Temporal Networks For Video Prediction",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Zeng et al., "Visual Forecasting By Imitating Dynamics In Natural Sequences", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zeng_2017_ICCV,
              author = "Zeng, Kuo-Hao and Shen, William B. and Huang, De-An and Sun, Min and Niebles, Juan Carlos",
              title = "Visual Forecasting By Imitating Dynamics In Natural Sequences",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Wang et al., "PredRNN: Recurrent Neural Networks For Predictive Learning Using Spatiotemporal LSTMs", NeurIPS, 2017. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2017_NeurIPS,
              author = "Wang, Yunbo and Long, Mingsheng and Wang, Jianmin and Gao, Zhifeng and Yu, Philip S",
              title = "PredRNN: Recurrent Neural Networks For Predictive Learning Using Spatiotemporal LSTMs",
              booktitle = "NeurIPS",
              year = "2017"
          }
          
        Oh et al., "HCNAF: Hyper-Conditioned Neural Autoregressive Flow and its Application for Probabilistic Occupancy Map Forecasting", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Oh_2020_CVPR,
              author = "Oh, Geunseob and Valois, Jean-Sebastien",
              title = "HCNAF: Hyper-Conditioned Neural Autoregressive Flow and its Application for Probabilistic Occupancy Map Forecasting",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Srivastava_2015_ICML,
          author = "Srivastava, Nitish and Mansimov, Elman and Salakhutdinov, Ruslan",
          title = "Unsupervised Learning Of Video Representations Using LSTMs",
          booktitle = "ICML",
          year = "2015"
      }
      
    THUMOS link
    • Summary: A dataset of 20K+ videos of 101 diverse action classes
    • Applications: Video prediction, Action prediction
    • Data type and annotations: RGB, activity label, temporal segment
    • Task: Activity
      Used in papers
        Liang et al., "Dual Motion GAN For Future-Flow Embedded Video Prediction", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2017_ICCV,
              author = "Liang, Xiaodan and Lee, Lisa and Dai, Wei and Xing, Eric P.",
              title = "Dual Motion GAN For Future-Flow Embedded Video Prediction",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Zhong et al., "Unsupervised Learning For Forecasting Action Representations", ICIP, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Zhong_2018_ICIP,
              author = "Zhong, Y. and Zheng, W.",
              booktitle = "ICIP",
              title = "Unsupervised Learning For Forecasting Action Representations",
              year = "2018"
          }
          
        Gao et al., "Red: Reinforced Encoder-Decoder Networks For Action Anticipation", BMVC, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Gao_2017_BMVC,
              author = "Gao, Jiyang and Yang, Zhenheng and Nevatia, Ram",
              title = "Red: Reinforced Encoder-Decoder Networks For Action Anticipation",
              year = "2017",
              booktitle = "BMVC"
          }
          
        Vondrick et al., "Anticipating Visual Representations From Unlabeled Video", CVPR, 2016. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Vondrick_2016_CVPR_2,
              author = "Vondrick, Carl and Pirsiavash, Hamed and Torralba, Antonio",
              title = "Anticipating Visual Representations From Unlabeled Video",
              booktitle = "CVPR",
              year = "2016"
          }
          
      Bibtex
      @Misc{Gorban_2015,
          author = "Gorban, A. and Idrees, H. and Jiang, Y.-G. and Roshan Zamir, A. and Laptev, I. and Shah, M. and Sukthankar, R.",
          title = "Thumos Challenge: Action Recognition With A Large Number Of Classes",
          howpublished = "\url{http://www.thumos.info/}",
          year = "2015"
      }
      
    Brain4Cars link paper arxiv
    • Summary: A dataset of 700 driving events using inside and outside looking cameras with annotated actions for various driving maneuvers
    • Applications: Action prediction
    • Data type and annotations: RGB, bounding box, attribute, temporal segment, vehicle sensors
    • Task: Driving
      Used in papers
        Jain et al., "Structural-Rnn: Deep Learning On Spatio-Temporal Graphs", CVPR, 2016. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Jain_2016_CVPR,
              author = "Jain, Ashesh and Zamir, Amir R. and Savarese, Silvio and Saxena, Ashutosh",
              title = "Structural-Rnn: Deep Learning On Spatio-Temporal Graphs",
              booktitle = "CVPR",
              year = "2016"
          }
          
        Jain et al., "Recurrent Neural Networks For Driver Activity Anticipation Via Sensory-Fusion Architecture", ICRA, 2016. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Jain_2016_ICRA,
              author = "Jain, A. and Singh, A. and Koppula, H. S. and Soh, S. and Saxena, A.",
              booktitle = "ICRA",
              title = "Recurrent Neural Networks For Driver Activity Anticipation Via Sensory-Fusion Architecture",
              year = "2016"
          }
          
        Jain et al., "Car That Knows Before You Do: Anticipating Maneuvers Via Learning Temporal Driving Models", ICCV, 2015. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Jain_2015_ICCV,
              author = "Jain, Ashesh and Koppula, Hema S. and Raghavan, Bharad and Soh, Shane and Saxena, Ashutosh",
              title = "Car That Knows Before You Do: Anticipating Maneuvers Via Learning Temporal Driving Models",
              booktitle = "ICCV",
              year = "2015"
          }
          
      Bibtex
      @InProceedings{Jain_2015_ICCV,
          author = "Jain, Ashesh and Koppula, Hema S. and Raghavan, Bharad and Soh, Shane and Saxena, Ashutosh",
          title = "Car That Knows Before You Do: Anticipating Maneuvers Via Learning Temporal Driving Models",
          booktitle = "ICCV",
          year = "2015"
      }
      
    SYSU 3DHOI link paper
    • Summary: A dataset of 12 simple activities in 480 video clips with depth maps
    • Applications: Action prediction
    • Data type and annotations: RGBD, 3D pose, activity label
    • Task: Object interaction
      Used in papers
        Wang et al., "Progressive Teacher-Student Learning For Early Action Prediction", CVPR, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2019_CVPR,
              author = "Wang, Xionghui and Hu, Jian-Fang and Lai, Jian-Huang and Zhang, Jianguo and Zheng, Wei-Shi",
              title = "Progressive Teacher-Student Learning For Early Action Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Hu et al., "Real-Time Rgb-D Activity Prediction By Soft Regression", ECCV, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Hu_2016_ECCV,
              author = "Hu, Jian-Fang and Zheng, Wei-Shi and Ma, Lianyang and Wang, Gang and Lai, Jianhuang",
              editor = "Leibe, Bastian and Matas, Jiri and Sebe, Nicu and Welling, Max",
              title = "Real-Time Rgb-D Activity Prediction By Soft Regression",
              booktitle = "ECCV",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Hu_2015_CVPR,
          author = "Hu, Jian-Fang and Zheng, Wei-Shi and Lai, Jianhuang and Zhang, Jianguo",
          title = "Jointly Learning Heterogeneous Features For Rgb-D Activity Recognition",
          booktitle = "CVPR",
          year = "2015"
      }
      
    WIDER link paper
    • Summary: A complex event dataset of 61 event categories in 50K+ images
    • Applications: Action prediction
    • Data type and annotations: RGB (image), activity label
    • Task: Activity
      Used in papers
        Safaei et al., "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution", WACV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Safaei_2019_WACV,
              author = "Safaei, M. and Foroosh, H.",
              booktitle = "WACV",
              title = "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Xiong_2015_CVPR,
          author = "Xiong, Yuanjun and Zhu, Kai and Lin, Dahua and Tang, Xiaoou",
          title = "Recognize Complex Events From Static Images By Fusing Deep Channels",
          booktitle = "CVPR",
          year = "2015"
      }
      
    Watch-n-Patch (WnP) link paper arxiv
    • Summary: A dataset of 458 videos of 21 daily actions in office and kitchen environments recorded using a Kinect V2 sensor
    • Applications: Action prediction
    • Data type and annotations: RGBD, 3D pose, activity label, temporal segment
    • Task: Activity
      Used in papers
        Kataoka et al., "Recognition Of Transitional Action For Short-Term Action Prediction Using Discriminative Temporal Cnn Feature", BMVC, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Kataoka_2016_BMVC,
              author = "Kataoka, Hirokatsu and Miyashita, Yudai and Hayashi, Masaki and Iwata, Kenji and Satoh, Yutaka",
              title = "Recognition Of Transitional Action For Short-Term Action Prediction Using Discriminative Temporal Cnn Feature",
              year = "2016",
              booktitle = "BMVC"
          }
          
      Bibtex
      @InProceedings{Wu_2015_CVPR,
          author = "Wu, Chenxia and Zhang, Jiemi and Savarese, Silvio and Saxena, Ashutosh",
          title = "Watch-N-Patch: Unsupervised Understanding Of Actions And Relations",
          booktitle = "CVPR",
          year = "2015"
      }
      
    SUN RGB-D link paper
    • Summary: A dataset of 10K RGB-D images of indoor environments with the corresponding 2D and 3D annotations
    • Applications: Other prediction
    • Data type and annotations: RGBD, 3D bounding box, object class
    • Task: Place
      Used in papers
        Mottaghi et al., "What Happens If... Learning To Predict The Effect Of Forces In Images", ECCV, 2016. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Mottaghi_2016_ECCV,
              author = "Mottaghi, Roozbeh and Rastegari, Mohammad and Gupta, Abhinav and Farhadi, Ali",
              editor = "Leibe, Bastian and Matas, Jiri and Sebe, Nicu and Welling, Max",
              title = "What Happens If... Learning To Predict The Effect Of Forces In Images",
              booktitle = "ECCV",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Song_2015_CVPR_2,
          author = "Song, Shuran and Lichtenberg, Samuel P. and Xiao, Jianxiong",
          title = "Sun Rgb-D: A Rgb-D Scene Understanding Benchmark Suite",
          booktitle = "CVPR",
          year = "2015"
      }
      
    MOT link arxiv
    • Summary: A collection of videos from existing datasets for the purpose of object tracking
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, bounding box
    • Task: Surveillance
      Used in papers
        Sanchez-Matilla et al., "A Predictor Of Moving Objects For First-Person Vision", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Sanchez_2019_ICIP,
              author = "Sanchez-Matilla, R. and Cavallaro, A.",
              booktitle = "ICIP",
              title = "A Predictor Of Moving Objects For First-Person Vision",
              year = "2019"
          }
          
      Bibtex
      @Article{Leal_2015_arxiv,
          author = "Leal-Taix\'e, Laura and Milan, Anton and Reid, Ian and Roth, Stefan and Schindler, Konrad",
          title = "Motchallenge 2015: Towards A Benchmark For Multi-Target Tracking",
          journal = "arXiv:1504.01942",
          year = "2015"
      }
      
    MicroBlog-Images (MBI-1M) link paper
    • Summary: An activity dataset of 1M images and the content of the corresponding tweets collected in years 2013-14
    • Applications: Other prediction
    • Data type and annotations: RGB (image), attribute, text
    • Task: Tweet
      Used in papers
        Wang et al., "Retweet Wars: Tweet Popularity Prediction Via Dynamic Multimodal Regression", WACV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2018_WACV,
              author = "Wang, K. and Bansal, M. and Frahm, J.",
              booktitle = "WACV",
              title = "Retweet Wars: Tweet Popularity Prediction Via Dynamic Multimodal Regression",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Cappallo_2015_ICMR,
          author = "Cappallo, Spencer and Mensink, Thomas and Snoek, Cees GM",
          title = "Latent Factors Of Visual Popularity Prediction",
          booktitle = "ICMR",
          year = "2015"
      }
      
    Georgia Tech Egocentric Activity Gaze+ (GTEA Gaze+) link paper
    • Summary: An egocentric dataset of 37 videos of 7 cooking activities recorded from 26 subjects with the corresponding gaze tracking information
    • Applications: Action prediction
    • Data type and annotations: RGB, gaze, mask, activity label, temporal segment
    • Task: Cooking (egocentric)
      Used in papers
        Shen et al., "Egocentric Activity Prediction Via Event Modulated Attention", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Shen_2018_ECCV,
              author = "Shen, Yang and Ni, Bingbing and Li, Zefan and Zhuang, Ning",
              title = "Egocentric Activity Prediction Via Event Modulated Attention",
              booktitle = "ECCV",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Li_2015_CVPR,
          author = "Li, Yin and Ye, Zhefan and Rehg, James M",
          title = "Delving Into Egocentric Actions",
          booktitle = "CVPR",
          year = "2015"
      }
      
    First Person Personalized Activities (FPPA) link paper
    • Summary: An egocentric dataset of 5 daily activities, such as drinking water, using a fridge, etc., consisting of 591 video clips recorded at 30fps
    • Applications: Action prediction
    • Data type and annotations: RGB, activity label, temporal segment
    • Task: Activity (egocentric)
      Used in papers
        Zhou et al., "Temporal Perception And Prediction In Ego-Centric Video", ICCV, 2015. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhou_2015_ICCV,
              author = "Zhou, Yipin and Berg, Tamara L.",
              title = "Temporal Perception And Prediction In Ego-Centric Video",
              booktitle = "ICCV",
              year = "2015"
          }
          
      Bibtex
      @InProceedings{Zhou_2015_ICCV,
          author = "Zhou, Yipin and Berg, Tamara L.",
          title = "Temporal Perception And Prediction In Ego-Centric Video",
          booktitle = "ICCV",
          year = "2015"
      }
      
    CMU Panoptic link paper arxiv
    • Summary: A multiview group activity dataset recorded with 10 RGB-D sensors and 30+ HD views with the corresponding 3D annotations
    • Applications: Action prediction, Motion prediction
    • Data type and annotations: RGBD, multiview, 3D pose, 3D facial landmark, transcripts
    • Task: Interaction
      Used in papers
        Joo et al., "Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction In A Triadic Interaction", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Joo_2019_CVPR,
              author = "Joo, Hanbyul and Simon, Tomas and Cikara, Mina and Sheikh, Yaser",
              title = "Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction In A Triadic Interaction",
              booktitle = "CVPR",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Joo_2015_ICCV_2,
          author = "Joo, Hanbyul and Liu, Hao and Tan, Lei and Gui, Lin and Nabbe, Bart and Matthews, Iain and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser",
          title = "Panoptic Studio: A Massively Multiview System For Social Motion Capture",
          booktitle = "ICCV",
          year = "2015"
      }
      
    Amazon link paper arxiv
    • Summary: A dataset of 142M+ product reviews from Amazon with corresponding metadata including price, brand, descriptions, category information, etc.
    • Applications: Other prediction
    • Data type and annotations: Features, attribute, text
    • Task: Fashion
      Used in papers
        Al-Halah et al., "Fashion Forward: Forecasting Visual Style In Fashion", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Al-Halah_2017_ICCV,
              author = "Al-Halah, Ziad and Stiefelhagen, Rainer and Grauman, Kristen",
              title = "Fashion Forward: Forecasting Visual Style In Fashion",
              booktitle = "ICCV",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Mcauley_2015_CRDIR,
          author = "McAuley, Julian and Targett, Christopher and Shi, Qinfeng and Van Den Hengel, Anton",
          title = "Image-Based Recommendations On Styles And Substitutes",
          booktitle = "SIGIR",
          year = "2015"
      }
      
    Whole-Body Human Motion (WBHM) link paper
    • Summary: A human motion dataset consisting of 2.4K+ experiments with 224 subjects and 135 objects
    • Applications: Motion prediction
    • Data type and annotations: RGB, 3D pose
    • Task: Pose
      Used in papers
        Corona et al., "Context-Aware Human Motion Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Corona_2020_CVPR,
              author = "Corona, Enric and Pumarola, Albert and Alenya, Guillem and Moreno-Noguer, Francesc",
              title = "Context-Aware Human Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @InProceedings{Mandery_2015_ICAR,
          author = "Mandery, Christian and Terlemez, Omer and Do, Martin and Vahrenkamp, Nikolaus and Asfour, Tamim",
          title = "The KIT Whole-Body Human Motion Database",
          booktitle = "ICAR",
          year = "2015"
      }
      

2014

    Human3.6M link paper
    • Summary: A large-scale dataset of 3D human poses with 3M+ images captured using 11 professional actors in 17 scenarios, such as discussion, smoking, taking photo, etc.
    • Applications: Video prediction, Action prediction, Motion prediction
    • Data type and annotations: RGB, 3D pose, activity label
    • Task: Activity
      Used in papers
        Le Guen et al., "Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Guen_2020_CVPR,
              author = "Le Guen, Vincent and Thome, Nicolas",
              title = "Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Xu et al., "Structure Preserving Video Prediction", CVPR, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_CVPR,
              author = "Xu, Jingwei and Ni, Bingbing and Li, Zefan and Cheng, Shuo and Yang, Xiaokang",
              title = "Structure Preserving Video Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Byeon et al., "Contextvp: Fully Context-Aware Video Prediction", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Byeon_2018_ECCV,
              author = "Byeon, Wonmin and Wang, Qin and Srivastava, Rupesh Kumar and Koumoutsakos, Petros",
              title = "Contextvp: Fully Context-Aware Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Cai et al., "Deep Video Generation, Prediction And Completion Of Human Action Sequences", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Cai_2018_ECCV,
              author = "Cai, Haoye and Bai, Chunyan and Tai, Yu-Wing and Tang, Chi-Keung",
              title = "Deep Video Generation, Prediction And Completion Of Human Action Sequences",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Xu et al., "Video Prediction Via Selective Sampling", NeurIPS, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_NeurIPS,
              author = "Xu, Jingwei and Ni, Bingbing and Yang, Xiaokang",
              title = "Video Prediction Via Selective Sampling",
              booktitle = "NeurIPS",
              year = "2018"
          }
          
        Wichers et al., "Hierarchical Long-Term Video Prediction Without Supervision", ICML, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Wichers_2018_ICML,
              author = "Wichers, Nevan and Villegas, Ruben and Erhan, Dumitru and Lee, Honglak",
              title = "Hierarchical Long-Term Video Prediction Without Supervision",
              booktitle = "ICML",
              year = "2018"
          }
          
        Ying et al., "Better Guider Predicts Future Better: Difference Guided Generative Adversarial Networks", ACCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ying_2018_ACCV,
              author = "Ying, Guohao and Zou, Yingtian and Wan, Lin and Hu, Yiming and Feng, Jiashi",
              editor = "Jawahar, C.V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Better Guider Predicts Future Better: Difference Guided Generative Adversarial Networks",
              booktitle = "ACCV",
              year = "2018"
          }
          
        Ji et al., "Dynamic Visual Sequence Prediction With Motion Flow Networks", WACV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ji_2018_WACV,
              author = "Ji, D. and Wei, Z. and Dunn, E. and Frahm, J. M.",
              booktitle = "WACV",
              title = "Dynamic Visual Sequence Prediction With Motion Flow Networks",
              year = "2018"
          }
          
        Villegas et al., "Learning To Generate Long-Term Future Via Hierarchical Prediction", ICML, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Villegas_2017_ICML,
              author = "Villegas, Ruben and Yang, Jimei and Zou, Yuliang and Sohn, Sungryull and Lin, Xunyu and Lee, Honglak",
              title = "Learning To Generate Long-Term Future Via Hierarchical Prediction",
              booktitle = "ICML",
              year = "2017"
          }
          
        Finn et al., "Unsupervised Learning For Physical Interaction Through Video Prediction", NeurIPS, 2016. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Finn_2016_NeurIPS,
              author = "Finn, Chelsea and Goodfellow, Ian and Levine, Sergey",
              title = "Unsupervised Learning For Physical Interaction Through Video Prediction",
              booktitle = "NeurIPS",
              year = "2016"
          }
          
        Butepage et al., "Deep Representation Learning For Human Motion Prediction And Classification", CVPR, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Butepage_2017_CVPR,
              author = "Butepage, Judith and Black, Michael J. and Kragic, Danica and Kjellstrom, Hedvig",
              title = "Deep Representation Learning For Human Motion Prediction And Classification",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Aliakbarian et al., "A Stochastic Conditioning Scheme for Diverse Human Motion Prediction", CVPR, 2020. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Aliakbarian_2020_CVPR,
              author = "Aliakbarian, Sadegh and Saleh, Fatemeh Sadat and Salzmann, Mathieu and Petersson, Lars and Gould, Stephen",
              title = "A Stochastic Conditioning Scheme for Diverse Human Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Cui et al., "Learning Dynamic Relationships for 3D Human Motion Prediction", CVPR, 2020. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Cui_2020_CVPR,
              author = "Cui, Qiongjie and Sun, Huaijiang and Yang, Fei",
              title = "Learning Dynamic Relationships for 3D Human Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Li et al., "Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2020_CVPR,
              author = "Li, Maosen and Chen, Siheng and Zhao, Yangheng and Zhang, Ya and Wang, Yanfeng and Tian, Qi",
              title = "Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Gopalakrishnan et al., "A Neural Temporal Model For Human Motion Prediction", CVPR, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Gopalakrishnan_2019_CVPR,
              author = "Gopalakrishnan, Anand and Mali, Ankur and Kifer, Dan and Giles, Lee and Ororbia, Alexander G.",
              title = "A Neural Temporal Model For Human Motion Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Liu et al., "Towards Natural And Accurate Future Motion Prediction Of Humans And Animals", CVPR, 2019. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Liu_2019_CVPR,
              author = "Liu, Zhenguang and Wu, Shuang and Jin, Shuyuan and Liu, Qi and Lu, Shijian and Zimmermann, Roger and Cheng, Li",
              title = "Towards Natural And Accurate Future Motion Prediction Of Humans And Animals",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Hernandez et al., "Human Motion Prediction Via Spatio-Temporal Inpainting", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Hernandez_2019_ICCV,
              author = "Hernandez, Alejandro and Gall, Jurgen and Moreno-Noguer, Francesc",
              title = "Human Motion Prediction Via Spatio-Temporal Inpainting",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Mao et al., "Learning Trajectory Dependencies For Human Motion Prediction", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Mao_2019_ICCV,
              author = "Mao, Wei and Liu, Miaomiao and Salzmann, Mathieu and Li, Hongdong",
              title = "Learning Trajectory Dependencies For Human Motion Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Wang et al., "Imitation Learning For Human Pose Prediction", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2019_ICCV,
              author = "Wang, Borui and Adeli, Ehsan and Chiu, Hsu-kuang and Huang, De-An and Niebles, Juan Carlos",
              title = "Imitation Learning For Human Pose Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Zhang et al., "Predicting 3D Human Dynamics From Video", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhang_2019_ICCV,
              author = "Zhang, Jason Y. and Felsen, Panna and Kanazawa, Angjoo and Malik, Jitendra",
              title = "Predicting 3D Human Dynamics From Video",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Chiu et al., "Action-Agnostic Human Pose Forecasting", WACV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Chiu_2019_WACV,
              author = "Chiu, H. and Adeli, E. and Wang, B. and Huang, D. and Niebles, J. C.",
              booktitle = "WACV",
              title = "Action-Agnostic Human Pose Forecasting",
              year = "2019"
          }
          
        Gui et al., "Few-Shot Human Motion Prediction Via Meta-Learning", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Gui_2018_ECCV,
              author = "Gui, Liang-Yan and Wang, Yu-Xiong and Ramanan, Deva and Moura, Jose M. F.",
              title = "Few-Shot Human Motion Prediction Via Meta-Learning",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Gui et al., "Adversarial Geometry-Aware Human Motion Prediction", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Gui_2018_ECCV_2,
              author = "Gui, Liang-Yan and Wang, Yu-Xiong and Liang, Xiaodan and Moura, Jose M. F.",
              title = "Adversarial Geometry-Aware Human Motion Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Gui et al., "Teaching Robots To Predict Human Motion", IROS, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Gui_2018_IROS,
              author = "Gui, L. and Zhang, K. and Wang, Y. and Liang, X. and Moura, J. M. F. and Veloso, M.",
              booktitle = "IROS",
              title = "Teaching Robots To Predict Human Motion",
              year = "2018"
          }
          
        Chao et al., "Forecasting Human Dynamics From Static Images", CVPR, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Chao_2017_CVPR,
              author = "Chao, Yu-Wei and Yang, Jimei and Price, Brian and Cohen, Scott and Deng, Jia",
              title = "Forecasting Human Dynamics From Static Images",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Martinez et al., "On Human Motion Prediction Using Recurrent Neural Networks", CVPR, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Martinez_2017_CVPR,
              author = "Martinez, Julieta and Black, Michael J. and Romero, Javier",
              title = "On Human Motion Prediction Using Recurrent Neural Networks",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Jain et al., "Structural-Rnn: Deep Learning On Spatio-Temporal Graphs", CVPR, 2016. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Jain_2016_CVPR,
              author = "Jain, Ashesh and Zamir, Amir R. and Savarese, Silvio and Saxena, Ashutosh",
              title = "Structural-Rnn: Deep Learning On Spatio-Temporal Graphs",
              booktitle = "CVPR",
              year = "2016"
          }
          
        Fragkiadaki et al., "Recurrent Network Models For Human Dynamics", ICCV, 2015. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Fragkiadaki_2015_ICCV,
              author = "Fragkiadaki, Katerina and Levine, Sergey and Felsen, Panna and Malik, Jitendra",
              title = "Recurrent Network Models For Human Dynamics",
              booktitle = "ICCV",
              year = "2015"
          }
          
      Bibtex
      @Article{Ionescu_2014_PAMI,
          author = "Ionescu, Catalin and Papava, Dragos and Olaru, Vlad and Sminchisescu, Cristian",
          title = "Human3.6M: Large Scale Datasets And Predictive Methods For 3D Human Sensing In Natural Environments",
          journal = "PAMI",
          volume = "36",
          number = "7",
          pages = "1325-1339",
          year = "2014"
      }
      
    Sports-1M link paper
    • Summary: A large-scale dataset of 1M sports videos with 487 classes
    • Applications: Video prediction, Action prediction
    • Data type and annotations: RGB, activity label
    • Task: Sport
      Used in papers
        Lu et al., "Flexible Spatio-Temporal Networks For Video Prediction", CVPR, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Lu_2017_CVPR,
              author = "Lu, Chaochao and Hirsch, Michael and Scholkopf, Bernhard",
              title = "Flexible Spatio-Temporal Networks For Video Prediction",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Bhattacharjee et al., "Temporal Coherency Based Criteria For Predicting Video Frames Using Deep Multi-Stage Generative Adversarial Networks", NeurIPS, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Bhattacharjee_2017_NeurIPS,
              author = "Bhattacharjee, Prateep and Das, Sukhendu",
              title = "Temporal Coherency Based Criteria For Predicting Video Frames Using Deep Multi-Stage Generative Adversarial Networks",
              booktitle = "NeurIPS",
              year = "2017"
          }
          
        Kong et al., "Deep Sequential Context Networks For Action Prediction", CVPR, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Kong_2017_CVPR,
              author = "Kong, Yu and Tao, Zhiqiang and Fu, Yun",
              title = "Deep Sequential Context Networks For Action Prediction",
              booktitle = "CVPR",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Karpathy_2014_CVPR,
          author = "Karpathy, Andrej and Toderici, George and Shetty, Sanketh and Leung, Thomas and Sukthankar, Rahul and Fei-Fei, Li",
          title = "Large-Scale Video Classification With Convolutional Neural Networks",
          year = "2014",
          booktitle = "CVPR"
      }
      
    Breakfast link paper
    • Summary: A dataset of 77 hours of video recordings showing 10 breakfast preparation actions performed by 52 subjects in 18 different locations
    • Applications: Action prediction
    • Data type and annotations: RGB, activity label, temporal segment
    • Task: Cooking
      Used in papers
        Gammulle et al., "Forecasting Future Action Sequences With Neural Memory Networks", BMVC, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Gammulle_2019_BMVC,
              author = "Gammulle, Harshala and Denman, Simon and Sridharan, Sridha and Fookes, Clinton",
              title = "Forecasting Future Action Sequences With Neural Memory Networks",
              year = "2019",
              booktitle = "BMVC"
          }
          
        Alati et al., "Help By Predicting What To Do", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Alati_2019_ICIP,
              author = "Alati, E. and Mauro, L. and Ntouskos, V. and Pirri, F.",
              booktitle = "ICIP",
              title = "Help By Predicting What To Do",
              year = "2019"
          }
          
        Abu Farha et al., "When Will You Do What? - Anticipating Temporal Occurrences Of Activities", CVPR, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Farha_2018_CVPR,
              author = "Abu Farha, Yazan and Richard, Alexander and Gall, Juergen",
              title = "When Will You Do What? - Anticipating Temporal Occurrences Of Activities",
              booktitle = "CVPR",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Kuehne_2014_CVPR,
          author = "Kuehne, H. and Arslan, A. B. and Serre, T.",
          title = "The Language Of Actions: Recovering The Syntax And Semantics Of Goal-Directed Human Activities",
          booktitle = "CVPR",
          year = "2014"
      }
      
    Online RGBD Action Dataset (ORGBD) link paper
    • Summary: A dataset of RGBD sequences capturing 7 human-object interaction activities including drinking, eating, using a laptop, reading on a cellphone, etc.
    • Applications: Action prediction
    • Data type and annotations: RGBD, bounding box, 3D pose, activity label
    • Task: Activity
      Used in papers
        Hu et al., "Real-Time RGB-D Activity Prediction By Soft Regression", ECCV, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Hu_2016_ECCV,
              author = "Hu, Jian-Fang and Zheng, Wei-Shi and Ma, Lianyang and Wang, Gang and Lai, Jianhuang",
              editor = "Leibe, Bastian and Matas, Jiri and Sebe, Nicu and Welling, Max",
              title = "Real-Time RGB-D Activity Prediction By Soft Regression",
              booktitle = "ECCV",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Yu_2014_ACCV,
          author = "Yu, Gang and Liu, Zicheng and Yuan, Junsong",
          editor = "Cremers, Daniel and Reid, Ian and Saito, Hideo and Yang, Ming-Hsuan",
          title = "Discriminative Orderlet Mining For Real-Time Recognition Of Human-Object Interaction",
          booktitle = "ACCV",
          year = "2015"
      }
      
    MPII Human Pose link paper
    • Summary: A pose detection dataset with 25K images containing 40K subjects performing 410 different activities
    • Applications: Motion prediction
    • Data type and annotations: RGB, pose, activity label
    • Task: Activity
      Used in papers
        Chao et al., "Forecasting Human Dynamics From Static Images", CVPR, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Chao_2017_CVPR,
              author = "Chao, Yu-Wei and Yang, Jimei and Price, Brian and Cohen, Scott and Deng, Jia",
              title = "Forecasting Human Dynamics From Static Images",
              booktitle = "CVPR",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Andriluka_2014_CVPR,
          author = "Andriluka, Mykhaylo and Pishchulin, Leonid and Gehler, Peter and Schiele, Bernt",
          title = "2D Human Pose Estimation: New Benchmark And State Of The Art Analysis",
          booktitle = "CVPR",
          year = "2014"
      }
      

2013

↑ top
    Penn Action link paper
    • Summary: A dataset of 2.3K+ video clips of 15 actions with the corresponding human joint annotations
    • Applications: Video prediction, Motion prediction
    • Data type and annotations: RGB, bounding box, pose, activity label
    • Task: Activity
      Used in papers
        Ye et al., "Compositional Video Prediction", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ye_2019_ICCV,
              author = "Ye, Yufei and Singh, Maneesh and Gupta, Abhinav and Tulsiani, Shubham",
              title = "Compositional Video Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Kim et al., "Unsupervised Keypoint Learning For Guiding Class-Conditional Video Prediction", NeurIPS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Kim_2019_NeurIPS,
              author = "Kim, Yunji and Nam, Seonghyeon and Cho, In and Kim, Seon Joo",
              title = "Unsupervised Keypoint Learning For Guiding Class-Conditional Video Prediction",
              booktitle = "NeurIPS",
              year = "2019"
          }
          
        Tang et al., "Pose Guided Global And Local GAN For Appearance Preserving Human Video Prediction", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Tang_2019_ICIP,
              author = "Tang, J. and Hu, H. and Zhou, Q. and Shan, H. and Tian, C. and Quek, T. Q. S.",
              booktitle = "ICIP",
              title = "Pose Guided Global And Local GAN For Appearance Preserving Human Video Prediction",
              year = "2019"
          }
          
        Zhao et al., "Learning To Forecast And Refine Residual Motion For Image-To-Video Generation", ECCV, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhao_2018_ECCV,
              author = "Zhao, Long and Peng, Xi and Tian, Yu and Kapadia, Mubbasir and Metaxas, Dimitris",
              title = "Learning To Forecast And Refine Residual Motion For Image-To-Video Generation",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Walker et al., "The Pose Knows: Video Forecasting By Generating Pose Futures", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Walker_2017_ICCV,
              author = "Walker, Jacob and Marino, Kenneth and Gupta, Abhinav and Hebert, Martial",
              title = "The Pose Knows: Video Forecasting By Generating Pose Futures",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Villegas et al., "Learning To Generate Long-Term Future Via Hierarchical Prediction", ICML, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Villegas_2017_ICML,
              author = "Villegas, Ruben and Yang, Jimei and Zou, Yuliang and Sohn, Sungryull and Lin, Xunyu and Lee, Honglak",
              title = "Learning To Generate Long-Term Future Via Hierarchical Prediction",
              booktitle = "ICML",
              year = "2017"
          }
          
        Zhang et al., "Predicting 3D Human Dynamics From Video", ICCV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhang_2019_ICCV,
              author = "Zhang, Jason Y. and Felsen, Panna and Kanazawa, Angjoo and Malik, Jitendra",
              title = "Predicting 3D Human Dynamics From Video",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Chiu et al., "Action-Agnostic Human Pose Forecasting", WACV, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Chiu_2019_WACV,
              author = "Chiu, H. and Adeli, E. and Wang, B. and Huang, D. and Niebles, J. C.",
              booktitle = "WACV",
              title = "Action-Agnostic Human Pose Forecasting",
              year = "2019"
          }
          
        Chao et al., "Forecasting Human Dynamics From Static Images", CVPR, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Chao_2017_CVPR,
              author = "Chao, Yu-Wei and Yang, Jimei and Price, Brian and Cohen, Scott and Deng, Jia",
              title = "Forecasting Human Dynamics From Static Images",
              booktitle = "CVPR",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Zhang_2013_ICCV,
          author = "Zhang, Weiyu and Zhu, Menglong and Derpanis, Konstantinos G",
          title = "From Actemes To Action: A Strongly-Supervised Representation For Detailed Action Understanding",
          booktitle = "ICCV",
          year = "2013"
      }
      
    Joint-Annotated Human Motion Data Base (JHMDB) link paper
    • Summary: A dataset of 928 video clips of 21 actions with corresponding flow maps and poses
    • Applications: Video prediction, Action prediction
    • Data type and annotations: RGB, mask, activity label, pose, optical flow
    • Task: Activity
      Used in papers
        Tang et al., "Pose Guided Global And Local GAN For Appearance Preserving Human Video Prediction", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Tang_2019_ICIP,
              author = "Tang, J. and Hu, H. and Zhou, Q. and Shan, H. and Tian, C. and Quek, T. Q. S.",
              booktitle = "ICIP",
              title = "Pose Guided Global And Local GAN For Appearance Preserving Human Video Prediction",
              year = "2019"
          }
          
        Sun et al., "Relational Action Forecasting", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sun_2019_CVPR,
              author = "Sun, Chen and Shrivastava, Abhinav and Vondrick, Carl and Sukthankar, Rahul and Murphy, Kevin and Schmid, Cordelia",
              title = "Relational Action Forecasting",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Zhao et al., "Spatiotemporal Feature Residual Propagation For Action Prediction", ICCV, 2019. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhao_2019_ICCV,
              author = "Zhao, He and Wildes, Richard P.",
              title = "Spatiotemporal Feature Residual Propagation For Action Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Shi et al., "Action Anticipation With RBF Kernelized Feature Mapping RNN", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Shi_2018_ECCV,
              author = "Shi, Yuge and Fernando, Basura and Hartley, Richard",
              title = "Action Anticipation With RBF Kernelized Feature Mapping RNN",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Aliakbarian et al., "Encouraging LSTMs To Anticipate Actions Very Early", ICCV, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Aliakbarian_2017_ICCV,
              author = "Sadegh Aliakbarian, Mohammad and Sadat Saleh, Fatemeh and Salzmann, Mathieu and Fernando, Basura and Petersson, Lars and Andersson, Lars",
              title = "Encouraging LSTMs To Anticipate Actions Very Early",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Singh et al., "Online Real-Time Multiple Spatiotemporal Action Localisation And Prediction", ICCV, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Singh_2017_ICCV,
              author = "Singh, Gurkirt and Saha, Suman and Sapienza, Michael and Torr, Philip H. S. and Cuzzolin, Fabio",
              title = "Online Real-Time Multiple Spatiotemporal Action Localisation And Prediction",
              booktitle = "ICCV",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Jhuang_2013_ICCV,
          author = "Jhuang, H. and Gall, J. and Zuffi, S. and Schmid, C. and Black, M. J.",
          title = "Towards Understanding Action Recognition",
          booktitle = "ICCV",
          year = "2013"
      }
      
    CAD-120 link paper arxiv
    • Summary: A dataset of 120 RGBD videos of 10 daily activities performed by 4 subjects
    • Applications: Action prediction
    • Data type and annotations: RGBD, 3D pose, activity label, affordance label
    • Task: Activity
      Used in papers
        Alati et al., "Help By Predicting What To Do", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Alati_2019_ICIP,
              author = "Alati, E. and Mauro, L. and Ntouskos, V. and Pirri, F.",
              booktitle = "ICIP",
              title = "Help By Predicting What To Do",
              year = "2019"
          }
          
        Schydlo et al., "Anticipation In Human-Robot Cooperation: A Recurrent Neural Network Approach For Multiple Action Sequences Prediction", ICRA, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Schydlo_2018_ICRA_2,
              author = "Schydlo, P. and Rakovic, M. and Jamone, L. and Santos-Victor, J.",
              booktitle = "ICRA",
              title = "Anticipation In Human-Robot Cooperation: A Recurrent Neural Network Approach For Multiple Action Sequences Prediction",
              year = "2018"
          }
          
        Qi et al., "Predicting Human Activities Using Stochastic Grammar", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Qi_2017_ICCV,
              author = "Qi, Siyuan and Huang, Siyuan and Wei, Ping and Zhu, Song-Chun",
              title = "Predicting Human Activities Using Stochastic Grammar",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Jain et al., "Structural-RNN: Deep Learning On Spatio-Temporal Graphs", CVPR, 2016. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Jain_2016_CVPR,
              author = "Jain, Ashesh and Zamir, Amir R. and Savarese, Silvio and Saxena, Ashutosh",
              title = "Structural-RNN: Deep Learning On Spatio-Temporal Graphs",
              booktitle = "CVPR",
              year = "2016"
          }
          
        Hu et al., "Human Intent Forecasting Using Intrinsic Kinematic Constraints", IROS, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Hu_2016_IROS,
              author = "Hu, N. and Bestick, A. and Englebienne, G. and Bajcsy, R. and Kröse, B.",
              booktitle = "IROS",
              title = "Human Intent Forecasting Using Intrinsic Kinematic Constraints",
              year = "2016"
          }
          
      Bibtex
      @Article{Koppula_2013_IJRR,
          author = "Koppula, Hema Swetha and Gupta, Rudhir and Saxena, Ashutosh",
          title = "Learning Human Activities And Object Affordances From RGB-D Videos",
          journal = "IJRR",
          volume = "32",
          number = "8",
          pages = "951--970",
          year = "2013"
      }
      
    50Salads link paper
    • Summary: A dataset of 25 human subjects each preparing 2 mixed salads, with 4h+ of annotated accelerometer and RGB-D video data recorded at 50 Hz and 30 Hz, respectively
    • Applications: Action prediction
    • Data type and annotations: RGBD, activity label, temporal segment, accelerometer
    • Task: Cooking (egocentric)
      Used in papers
        Ke et al., "Time-Conditioned Action Anticipation In One Shot", CVPR, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ke_2019_CVPR,
              author = "Ke, Qiuhong and Fritz, Mario and Schiele, Bernt",
              title = "Time-Conditioned Action Anticipation In One Shot",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Gammulle et al., "Forecasting Future Action Sequences With Neural Memory Networks", BMVC, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Gammulle_2019_BMVC,
              author = "Gammulle, Harshala and Denman, Simon and Sridharan, Sridha and Fookes, Clinton",
              title = "Forecasting Future Action Sequences With Neural Memory Networks",
              year = "2019",
              booktitle = "BMVC"
          }
          
        Abu Farha et al., "When Will You Do What? - Anticipating Temporal Occurrences Of Activities", CVPR, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Farha_2018_CVPR,
              author = "Abu Farha, Yazan and Richard, Alexander and Gall, Juergen",
              title = "When Will You Do What? - Anticipating Temporal Occurrences Of Activities",
              booktitle = "CVPR",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Stein_2013_IJCPUC,
          author = "Stein, Sebastian and McKenna, Stephen J",
          title = "Combining Embedded Accelerometers With Computer Vision For Recognizing Food Preparation Activities",
          booktitle = "UbiComp",
          year = "2013"
      }
      
    Daimler Path link paper
    • Summary: A dataset of 68 pedestrian sequences recorded using a dashboard camera inside a vehicle during stationary and mobile states
    • Applications: Action prediction
    • Data type and annotations: Stereo grayscale, bounding box, temporal segment, vehicle sensors
    • Task: Driving
      Used in papers
        Schulz et al., "Pedestrian Intention Recognition Using Latent-Dynamic Conditional Random Fields", IV, 2015. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Schulz_2015_IV,
              author = "Schulz, Andreas Th and Stiefelhagen, Rainer",
              title = "Pedestrian Intention Recognition Using Latent-Dynamic Conditional Random Fields",
              booktitle = "IV",
              year = "2015"
          }
          
        Schulz et al., "A Controlled Interactive Multiple Model Filter For Combined Pedestrian Intention Recognition And Path Prediction", ITSC, 2015. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Schulz_2015_ITSC,
              author = "Schulz, Andreas and Stiefelhagen, Rainer",
              title = "A Controlled Interactive Multiple Model Filter For Combined Pedestrian Intention Recognition And Path Prediction",
              booktitle = "ITSC",
              year = "2015"
          }
          
      Bibtex
      @InProceedings{Schneider_2013_GCPR,
          author = "Schneider, Nicolas and Gavrila, Dariu M",
          title = "Pedestrian Path Prediction With Recursive Bayesian Filters: A Comparative Study",
          booktitle = "GCPR",
          year = "2013"
      }
      
    CUHK Avenue link paper
    • Summary: A dataset of 37 video clips with 30K+ frames showing abnormal events
    • Applications: Video prediction, Trajectory prediction
    • Data type and annotations: RGB, bounding box, anomaly, temporal segment
    • Task: Surveillance, Anomaly
      Used in papers
        Kwon et al., "Predicting Future Frames Using Retrospective Cycle GAN", CVPR, 2019. paper
        Xu et al., "Encoding Crowd Interaction With Deep Neural Network For Pedestrian Trajectory Prediction", CVPR, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_CVPR_encoding,
              author = "Xu, Yanyu and Piao, Zhixin and Gao, Shenghua",
              title = "Encoding Crowd Interaction With Deep Neural Network For Pedestrian Trajectory Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Lu_2013_ICCV,
          author = "Lu, Cewu and Shi, Jianping and Jia, Jiaya",
          title = "Abnormal Event Detection At 150 FPS In MATLAB",
          booktitle = "ICCV",
          year = "2013"
      }
      
    ATC link paper
    • Summary: A dataset of human tracks recorded in a shopping mall for a period of 92 days using 3D range sensors
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, trajectory, attribute, depth
    • Task: Surveillance
      Used in papers
        Rudenko et al., "Joint Long-Term Prediction Of Human Motion Using A Planning-Based Social Force Approach", ICRA, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Rudenko_2018_ICRA,
              author = "Rudenko, A. and Palmieri, L. and Arras, K. O.",
              booktitle = "ICRA",
              title = "Joint Long-Term Prediction Of Human Motion Using A Planning-Based Social Force Approach",
              year = "2018"
          }
          
      Bibtex
      @Article{Brvsvcic_2013_HMS,
          author = "Br{\v{s}}{\v{c}}i{\'c}, Dra{\v{z}}en and Kanda, Takayuki and Ikeda, Tetsushi and Miyashita, Takahiro",
          title = "Person Tracking In Large Public Spaces Using 3-D Range Sensors",
          journal = "Transactions on Human-Machine Systems",
          volume = "43",
          number = "6",
          pages = "522--534",
          year = "2013"
      }
      

2012

↑ top
    UCF-101 link arxiv
    • Summary: A large-scale dataset of 101 actions with 13K+ video clips divided into 5 groups of human-object interaction, body-motion only, human-human interaction, playing musical instruments, and sports
    • Applications: Video prediction, Action prediction, Motion prediction
    • Data type and annotations: RGB, activity label
    • Task: Activity
      Used in papers
        Kwon et al., "Predicting Future Frames Using Retrospective Cycle GAN", CVPR, 2019. paper
        Ho et al., "SME-Net: Sparse Motion Estimation For Parametric Video Prediction Through Reinforcement Learning", ICCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ho_2019_ICCV,
              author = "Ho, Yung-Han and Cho, Chuan-Yuan and Peng, Wen-Hsiao and Jin, Guo-Lun",
              title = "SME-Net: Sparse Motion Estimation For Parametric Video Prediction Through Reinforcement Learning",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Zhang et al., "Looking-Ahead: Neural Future Video Frame Prediction", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Zhang_2019_ICIP,
              author = "Zhang, C. and Chen, T. and Liu, H. and Shen, Q. and Ma, Z.",
              booktitle = "ICIP",
              title = "Looking-Ahead: Neural Future Video Frame Prediction",
              year = "2019"
          }
          
        Xu et al., "Structure Preserving Video Prediction", CVPR, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_CVPR,
              author = "Xu, Jingwei and Ni, Bingbing and Li, Zefan and Cheng, Shuo and Yang, Xiaokang",
              title = "Structure Preserving Video Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Byeon et al., "ContextVP: Fully Context-Aware Video Prediction", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Byeon_2018_ECCV,
              author = "Byeon, Wonmin and Wang, Qin and Kumar Srivastava, Rupesh and Koumoutsakos, Petros",
              title = "ContextVP: Fully Context-Aware Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Cai et al., "Deep Video Generation, Prediction And Completion Of Human Action Sequences", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Cai_2018_ECCV,
              author = "Cai, Haoye and Bai, Chunyan and Tai, Yu-Wing and Tang, Chi-Keung",
              title = "Deep Video Generation, Prediction And Completion Of Human Action Sequences",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Liu et al., "DYAN: A Dynamical Atoms-Based Network For Video Prediction", ECCV, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Liu_2018_ECCV,
              author = "Liu, Wenqian and Sharma, Abhishek and Camps, Octavia and Sznaier, Mario",
              title = "DYAN: A Dynamical Atoms-Based Network For Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Oliu et al., "Folded Recurrent Neural Networks For Future Video Prediction", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Oliu_2018_ECCV,
              author = "Oliu, Marc and Selva, Javier and Escalera, Sergio",
              title = "Folded Recurrent Neural Networks For Future Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Bhattacharjee et al., "Predicting Video Frames Using Feature Based Locally Guided Objectives", ACCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Bhattacharjee_2018_ACCV,
              author = "Bhattacharjee, Prateep and Das, Sukhendu",
              editor = "Jawahar, C.V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Predicting Video Frames Using Feature Based Locally Guided Objectives",
              booktitle = "ACCV",
              year = "2019"
          }
          
        Ying et al., "Better Guider Predicts Future Better: Difference Guided Generative Adversarial Networks", ACCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ying_2018_ACCV,
              author = "Ying, Guohao and Zou, Yingtian and Wan, Lin and Hu, Yiming and Feng, Jiashi",
              editor = "Jawahar, C.V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Better Guider Predicts Future Better: Difference Guided Generative Adversarial Networks",
              booktitle = "ACCV",
              year = "2018"
          }
          
        Lu et al., "Flexible Spatio-Temporal Networks For Video Prediction", CVPR, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Lu_2017_CVPR,
              author = "Lu, Chaochao and Hirsch, Michael and Sch{\"o}lkopf, Bernhard",
              title = "Flexible Spatio-Temporal Networks For Video Prediction",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Liang et al., "Dual Motion GAN For Future-Flow Embedded Video Prediction", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2017_ICCV,
              author = "Liang, Xiaodan and Lee, Lisa and Dai, Wei and Xing, Eric P.",
              title = "Dual Motion GAN For Future-Flow Embedded Video Prediction",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Walker et al., "The Pose Knows: Video Forecasting By Generating Pose Futures", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Walker_2017_ICCV,
              author = "Walker, Jacob and Marino, Kenneth and Gupta, Abhinav and Hebert, Martial",
              title = "The Pose Knows: Video Forecasting By Generating Pose Futures",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Bhattacharjee et al., "Temporal Coherency Based Criteria For Predicting Video Frames Using Deep Multi-Stage Generative Adversarial Networks", NeurIPS, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Bhattacharjee_2017_NeurIPS,
              author = "Bhattacharjee, Prateep and Das, Sukhendu",
              title = "Temporal Coherency Based Criteria For Predicting Video Frames Using Deep Multi-Stage Generative Adversarial Networks",
              booktitle = "NeurIPS",
              year = "2017"
          }
          
        Wang et al., "Progressive Teacher-Student Learning For Early Action Prediction", CVPR, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2019_CVPR,
              author = "Wang, Xionghui and Hu, Jian-Fang and Lai, Jian-Huang and Zhang, Jianguo and Zheng, Wei-Shi",
              title = "Progressive Teacher-Student Learning For Early Action Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Gammulle et al., "Predicting The Future: A Jointly Learnt Model For Action Anticipation", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Gammulle_2019_ICCV,
              author = "Gammulle, Harshala and Denman, Simon and Sridharan, Sridha and Fookes, Clinton",
              title = "Predicting The Future: A Jointly Learnt Model For Action Anticipation",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Zhao et al., "Spatiotemporal Feature Residual Propagation For Action Prediction", ICCV, 2019. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhao_2019_ICCV,
              author = "Zhao, He and Wildes, Richard P.",
              title = "Spatiotemporal Feature Residual Propagation For Action Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Safaei et al., "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution", WACV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Safaei_2019_WACV,
              author = "Safaei, M. and Foroosh, H.",
              booktitle = "WACV",
              title = "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution",
              year = "2019"
          }
          
        Chen et al., "Part-Activated Deep Reinforcement Learning For Action Prediction", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Chen_2018_ECCV,
              author = "Chen, Lei and Lu, Jiwen and Song, Zhanjie and Zhou, Jie",
              title = "Part-Activated Deep Reinforcement Learning For Action Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Shi et al., "Action Anticipation With RBF Kernelized Feature Mapping RNN", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Shi_2018_ECCV,
              author = "Shi, Yuge and Fernando, Basura and Hartley, Richard",
              title = "Action Anticipation With RBF Kernelized Feature Mapping RNN",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Cho et al., "A Temporal Sequence Learning For Action Recognition And Prediction", WACV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Cho_2018_WACV,
              author = "Cho, S. and Foroosh, H.",
              booktitle = "WACV",
              title = "A Temporal Sequence Learning For Action Recognition And Prediction",
              year = "2018"
          }
          
        Kong et al., "Deep Sequential Context Networks For Action Prediction", CVPR, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Kong_2017_CVPR,
              author = "Kong, Yu and Tao, Zhiqiang and Fu, Yun",
              title = "Deep Sequential Context Networks For Action Prediction",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Aliakbarian et al., "Encouraging Lstms To Anticipate Actions Very Early", ICCV, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Aliakbarian_2017_ICCV,
              author = "Sadegh Aliakbarian, Mohammad and Sadat Saleh, Fatemeh and Salzmann, Mathieu and Fernando, Basura and Petersson, Lars and Andersson, Lars",
              title = "Encouraging Lstms To Anticipate Actions Very Early",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Singh et al., "Online Real-Time Multiple Spatiotemporal Action Localisation And Prediction", ICCV, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Singh_2017_ICCV,
              author = "Singh, Gurkirt and Saha, Suman and Sapienza, Michael and Torr, Philip H. S. and Cuzzolin, Fabio",
              title = "Online Real-Time Multiple Spatiotemporal Action Localisation And Prediction",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Xu et al., "Human Activities Prediction By Learning Combinatorial Sparse Representations", ICIP, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2016_ICIP,
              author = "Xu, K. and Qin, Z. and Wang, G.",
              booktitle = "ICIP",
              title = "Human Activities Prediction By Learning Combinatorial Sparse Representations",
              year = "2016"
          }
          
      Bibtex
      @Article{Soomro_2012_arxiv,
          author = "Soomro, Khurram and Zamir, Amir Roshan and Shah, Mubarak",
          title = "Ucf101: A Dataset Of 101 Human Actions Classes From Videos In The Wild",
          journal = "arXiv:1212.0402",
          year = "2012"
      }
      
    KITTI link paper
    • Summary: A large-scale driving dataset recorded at 10hz with different modalities including stereo, LIDAR, GPS, etc.
    • Applications: Video prediction, Trajectory prediction, Other prediction
    • Data type and annotations: Stereo RGB, LIDAR, bounding box, optical flow, vehicle sensors, Tracking ID
    • Task: Driving
      Used in papers
        Jin et al., "Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Jin_2020_CVPR,
              author = "Jin, Beibei and Hu, Yu and Tang, Qiankun and Niu, Jingyu and Shi, Zhiping and Han, Yinhe and Li, Xiaowei",
              title = "Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Wu et al., "Future Video Synthesis With Object Motion Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Wu_2020_CVPR,
              author = "Wu, Yue and Gao, Rongrong and Park, Jaesik and Chen, Qifeng",
              title = "Future Video Synthesis With Object Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Kwon et al., "Predicting Future Frames Using Retrospective Cycle Gan", CVPR, 2019. paper
        Gao et al., "Disentangling Propagation And Generation For Video Prediction", ICCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Gao_2019_ICCV,
              author = "Gao, Hang and Xu, Huazhe and Cai, Qi-Zhi and Wang, Ruth and Yu, Fisher and Darrell, Trevor",
              title = "Disentangling Propagation And Generation For Video Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Ho et al., "Deep Reinforcement Learning For Video Prediction", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ho_2019_ICIP,
              author = "Ho, Y. and Cho, C. and Peng, W.",
              booktitle = "ICIP",
              title = "Deep Reinforcement Learning For Video Prediction",
              year = "2019"
          }
          
        Byeon et al., "Contextvp: Fully Context-Aware Video Prediction", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Byeon_2018_ECCV,
              author = "Byeon, Wonmin and Wang, Qin and Kumar Srivastava, Rupesh and Koumoutsakos, Petros",
              title = "Contextvp: Fully Context-Aware Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Liu et al., "Dyan: A Dynamical Atoms-Based Network For Video Prediction", ECCV, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Liu_2018_ECCV,
              author = "Liu, Wenqian and Sharma, Abhishek and Camps, Octavia and Sznaier, Mario",
              title = "Dyan: A Dynamical Atoms-Based Network For Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Bhattacharjee et al., "Predicting Video Frames Using Feature Based Locally Guided Objectives", ACCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Bhattacharjee_2018_ACCV,
              author = "Bhattacharjee, Prateep and Das, Sukhendu",
              editor = "Jawahar, C.V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Predicting Video Frames Using Feature Based Locally Guided Objectives",
              booktitle = "ACCV",
              year = "2018"
          }
          
        Ying et al., "Better Guider Predicts Future Better: Difference Guided Generative Adversarial Networks", ACCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ying_2018_ACCV,
              author = "Ying, Guohao and Zou, Yingtian and Wan, Lin and Hu, Yiming and Feng, Jiashi",
              editor = "Jawahar, C.V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Better Guider Predicts Future Better: Difference Guided Generative Adversarial Networks",
              booktitle = "ACCV",
              year = "2018"
          }
          
        Jin et al., "Varnet: Exploring Variations For Unsupervised Video Prediction", IROS, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Jin_2018_IROS,
              author = "Jin, B. and Hu, Y. and Zeng, Y. and Tang, Q. and Liu, S. and Ye, J.",
              booktitle = "IROS",
              title = "Varnet: Exploring Variations For Unsupervised Video Prediction",
              year = "2018"
          }
          
        Liang et al., "Dual Motion Gan For Future-Flow Embedded Video Prediction", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2017_ICCV,
              author = "Liang, Xiaodan and Lee, Lisa and Dai, Wei and Xing, Eric P.",
              title = "Dual Motion Gan For Future-Flow Embedded Video Prediction",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Bhattacharjee et al., "Temporal Coherency Based Criteria For Predicting Video Frames Using Deep Multi-Stage Generative Adversarial Networks", NeurIPS, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Bhattacharjee_2017_NeurIPS,
              author = "Bhattacharjee, Prateep and Das, Sukhendu",
              title = "Temporal Coherency Based Criteria For Predicting Video Frames Using Deep Multi-Stage Generative Adversarial Networks",
              booktitle = "NeurIPS",
              year = "2017"
          }
          
        Marchetti et al., "MANTRA: Memory Augmented Networks for Multiple Trajectory Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Marchetti_2020_CVPR,
              author = "Marchetti, Francesco and Becattini, Federico and Seidenari, Lorenzo and Del Bimbo, Alberto",
              title = "MANTRA: Memory Augmented Networks for Multiple Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Srikanth et al., "Infer: Intermediate Representations For Future Prediction", IROS, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Srikanth_2019_IROS,
              author = "Srikanth, Shashank and Ansari, Junaid Ahmed and Sharma, Sarthak and others",
              booktitle = "IROS",
              title = "Infer: Intermediate Representations For Future Prediction",
              year = "2019"
          }
          
        Rhinehart et al., "R2P2: A Reparameterized Pushforward Policy For Diverse, Precise Generative Path Forecasting", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Rhinehart_2018_ECCV,
              author = "Rhinehart, Nicholas and Kitani, Kris M. and Vernaza, Paul",
              title = "R2P2: A Reparameterized Pushforward Policy For Diverse, Precise Generative Path Forecasting",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Lee et al., "Desire: Distant Future Prediction In Dynamic Scenes With Interacting Agents", CVPR, 2017. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Lee_2017_CVPR,
              author = "Lee, Namhoon and Choi, Wongun and Vernaza, Paul and Choy, Christopher B. and Torr, Philip H. S. and Chandraker, Manmohan",
              title = "Desire: Distant Future Prediction In Dynamic Scenes With Interacting Agents",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Mohajerin et al., "Multi-Step Prediction Of Occupancy Grid Maps With Recurrent Neural Networks", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Mohajerin_2019_CVPR,
              author = "Mohajerin, Nima and Rohani, Mohsen",
              title = "Multi-Step Prediction Of Occupancy Grid Maps With Recurrent Neural Networks",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Guizilini et al., "Dynamic Hilbert Maps: Real-Time Occupancy Predictions In Changing Environments", ICRA, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Guizilini_2019_ICRA,
              author = "Guizilini, V. and Senanayake, R. and Ramos, F.",
              booktitle = "ICRA",
              title = "Dynamic Hilbert Maps: Real-Time Occupancy Predictions In Changing Environments",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Geiger_2012_CVPR,
          author = "Geiger, Andreas and Lenz, Philip and Urtasun, Raquel",
          title = "Are We Ready For Autonomous Driving? The Kitti Vision Benchmark Suite",
          booktitle = "CVPR",
          year = "2012"
      }
      
    New York Grand Central (GC) link paper
    • Summary: A trajectory dataset of pedestrians walking in the train station with 50K+ samples annotated at 25fps
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, trajectory
    • Task: Surveillance
      Used in papers
        Xu et al., "Encoding Crowd Interaction With Deep Neural Network For Pedestrian Trajectory Prediction", CVPR, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_CVPR_encoding,
              author = "Xu, Yanyu and Piao, Zhixin and Gao, Shenghua",
              title = "Encoding Crowd Interaction With Deep Neural Network For Pedestrian Trajectory Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Yoo et al., "Visual Path Prediction In Complex Scenes With Crowded Moving Objects", CVPR, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Yoo_2016_CVPR,
              author = "Yoo, YoungJoon and Yun, Kimin and Yun, Sangdoo and Hong, JongHee and Jeong, Hawook and Young Choi, Jin",
              title = "Visual Path Prediction In Complex Scenes With Crowded Moving Objects",
              booktitle = "CVPR",
              year = "2016"
          }
          
        Yi et al., "Pedestrian Behavior Understanding And Prediction With Deep Neural Networks", ECCV, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Yi_2016_ECCV,
              author = "Yi, Shuai and Li, Hongsheng and Wang, Xiaogang",
              editor = "Leibe, Bastian and Matas, Jiri and Sebe, Nicu and Welling, Max",
              title = "Pedestrian Behavior Understanding And Prediction With Deep Neural Networks",
              booktitle = "ECCV",
              year = "2016"
          }
          
        Akbarzadeh et al., "Kernel Density Estimation For Target Trajectory Prediction", IROS, 2015. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Akbarzadeh_2015_IROS,
              author = "Akbarzadeh, V. and Gagné, C. and Parizeau, M.",
              booktitle = "IROS",
              title = "Kernel Density Estimation For Target Trajectory Prediction",
              year = "2015"
          }
          
      Bibtex
      @InProceedings{Zhou_2012_CVPR,
          author = "Zhou, Bolei and Wang, Xiaogang and Tang, Xiaoou",
          title = "Understanding Collective Crowd Behaviors: Learning A Mixture Model Of Dynamic Pedestrian-Agents",
          booktitle = "CVPR",
          year = "2012"
      }
      
    BIT link paper
    • Summary: A dataset of human interactions with 400 video clips capturing 8 different interaction classes
    • Applications: Action prediction
    • Data type and annotations: RGB, activity label
    • Task: Interaction
      Used in papers
        Zhao et al., "Spatiotemporal Feature Residual Propagation For Action Prediction", ICCV, 2019. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhao_2019_ICCV,
              author = "Zhao, He and Wildes, Richard P.",
              title = "Spatiotemporal Feature Residual Propagation For Action Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Chen et al., "Part-Activated Deep Reinforcement Learning For Action Prediction", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Chen_2018_ECCV,
              author = "Chen, Lei and Lu, Jiwen and Song, Zhanjie and Zhou, Jie",
              title = "Part-Activated Deep Reinforcement Learning For Action Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Kong et al., "Deep Sequential Context Networks For Action Prediction", CVPR, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Kong_2017_CVPR,
              author = "Kong, Yu and Tao, Zhiqiang and Fu, Yun",
              title = "Deep Sequential Context Networks For Action Prediction",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Lee et al., "Human Activity Prediction Based On Sub-Volume Relationship Descriptor", ICPR, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Lee_2016_ICPR,
              author = "Lee, Dong-Gyu and Lee, Seong-Whan",
              booktitle = "ICPR",
              title = "Human Activity Prediction Based On Sub-Volume Relationship Descriptor",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Kong_2012_ECCV,
          author = "Kong, Yu and Jia, Yunde and Fu, Yun",
          year = "2012",
          booktitle = "ECCV",
          title = "Learning Human Interaction By Interactive Phrases"
      }
      
    MPII Cooking link paper
    • Summary: A dataset of 65 cooking activities with 5.5K+ video clips recorded from 12 subjects
    • Applications: Action prediction
    • Data type and annotations: RGB, 3D pose, activity label, temporal segment
    • Task: Cooking
      Used in papers
        Alati et al., "Help By Predicting What To Do", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Alati_2019_ICIP,
              author = "Alati, E. and Mauro, L. and Ntouskos, V. and Pirri, F.",
              booktitle = "ICIP",
              title = "Help By Predicting What To Do",
              year = "2019"
          }
          
        Mahmud et al., "Joint Prediction Of Activity Labels And Starting Times In Untrimmed Videos", ICCV, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Mahmud_2017_ICCV,
              author = "Mahmud, Tahmida and Hasan, Mahmudul and Roy-Chowdhury, Amit K.",
              title = "Joint Prediction Of Activity Labels And Starting Times In Untrimmed Videos",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Mahmud et al., "A Poisson Process Model For Activity Forecasting", ICIP, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Mahmud_2016_ICIP,
              author = "Mahmud, T. and Hasan, M. and Chakraborty, A. and Roy-Chowdhury, A. K.",
              booktitle = "ICIP",
              title = "A Poisson Process Model For Activity Forecasting",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Rohrbach_2012_CVPR,
          author = "Rohrbach, Marcus and Amin, Sikandar and Andriluka, Mykhaylo and Schiele, Bernt",
          title = "A Database For Fine Grained Activity Detection Of Cooking Activities",
          booktitle = "CVPR",
          year = "2012"
      }
      
    UvA-NEMO link paper
    • Summary: A large-scale dataset of smiles with 1240 video clips covering both spontaneous and posed actions, recorded at 50fps from 400 subjects aged 8 to 76 years
    • Applications: Video prediction
    • Data type and annotations: RGB
    • Task: Face (smile)
      Used in papers
        Kim et al., "Unsupervised Keypoint Learning For Guiding Class-Conditional Video Prediction", NeurIPS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Kim_2019_NeurIPS,
              author = "Kim, Yunji and Nam, Seonghyeon and Cho, In and Kim, Seon Joo",
              title = "Unsupervised Keypoint Learning For Guiding Class-Conditional Video Prediction",
              booktitle = "NeurIPS",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Dibeklio_2012_ECCV,
          author = "Dibeklio{\u{g}}lu, Hamdi and Salah, Albert Ali and Gevers, Theo",
          editor = "Fitzgibbon, Andrew and Lazebnik, Svetlana and Perona, Pietro and Sato, Yoichi and Schmid, Cordelia",
          title = "Are You Really Smiling At Me? Spontaneous Versus Posed Enjoyment Smiles",
          booktitle = "ECCV",
          year = "2012"
      }
      
    UTKinect-Action (UTKA) link paper
    • Summary: A dataset of 10 basic actions, e.g. throwing, pushing, pulling, each performed by 10 subjects and recorded with a Kinect sensor at 15fps
    • Applications: Action prediction
    • Data type and annotations: RGBD, 3D pose, activity label, temporal segment
    • Task: Activity
      Used in papers
        Kataoka et al., "Recognition Of Transitional Action For Short-Term Action Prediction Using Discriminative Temporal Cnn Feature", BMVC, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Kataoka_2016_BMVC,
              author = "Kataoka, Hirokatsu and Miyashita, Yudai and Hayashi, Masaki and Iwata, Kenji and Satoh, Yutaka",
              title = "Recognition Of Transitional Action For Short-Term Action Prediction Using Discriminative Temporal Cnn Feature",
              year = "2016",
              booktitle = "BMVC"
          }
          
      Bibtex
      @InProceedings{Xia_2012_CVPRW,
          author = "Xia, L. and Chen, C.C. and Aggarwal, JK",
          title = "View Invariant Human Action Recognition Using Histograms Of 3D Joints",
          booktitle = "CVPRW",
          year = "2012"
      }
      
    SBU Kinect Interaction (SBUKI) link paper
    • Summary: A dataset of 8 dyadic human interactions, e.g. approaching, departing, pushing, kicking, recorded from 7 participants using a Kinect sensor comprising a total of approx. 300 interactions
    • Applications: Action prediction, Trajectory prediction, Motion prediction
    • Data type and annotations: RGBD, 3D pose, activity label
    • Task: Interaction
      Used in papers
        Yao et al., "Multiple Granularity Group Interaction Prediction", CVPR, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Yao_2018_CVPR,
              author = "Yao, Taiping and Wang, Minsi and Ni, Bingbing and Wei, Huawei and Yang, Xiaokang",
              title = "Multiple Granularity Group Interaction Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{kiwon_2012_CVPR,
          author = "Yun, Kiwon and Honorio, Jean and Chattopadhyay, Debaleena and Berg, Tamara L. and Samaras, Dimitris",
          title = "Two-Person Interaction Detection Using Body-Pose Features And Multiple Instance Learning",
          booktitle = "CVPRW",
          year = "2012"
      }
      
    MSR Daily Activity (MSRDA) link paper
    • Summary: A dataset of 16 activities, such as writing on a paper, using a laptop, using a vacuum cleaner, cheering up, etc., performed by 10 subjects and recorded using a Kinect sensor
    • Applications: Action prediction
    • Data type and annotations: RGBD, activity label
    • Task: Activity
      Used in papers
        Zhang et al., "Bio-Inspired Predictive Orientation Decomposition Of Skeleton Trajectories For Real-Time Human Activity Prediction", ICRA, 2015. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Zhang_2015_ICRA,
              author = "Zhang, H. and Parker, L. E.",
              booktitle = "ICRA",
              title = "Bio-Inspired Predictive Orientation Decomposition Of Skeleton Trajectories For Real-Time Human Activity Prediction",
              year = "2015"
          }
          
      Bibtex
      @InProceedings{Wang_2012_CVPR,
          author = "Wang, Jiang and Liu, Zicheng and Wu, Ying and Yuan, Junsong",
          title = "Mining Actionlet Ensemble For Action Recognition With Depth Cameras",
          booktitle = "CVPR",
          year = "2012"
      }
      
    MANIAC link paper
    • Summary: An object manipulation action dataset with 8 different manipulation actions performed by 5 different subjects recorded using a Kinect sensor
    • Applications: Action prediction
    • Data type and annotations: RGBD, semantic segment, activity label
    • Task: Object interaction
      Used in papers
        Ziaeetabar et al., "Prediction Of Manipulation Action Classes Using Semantic Spatial Reasoning", IROS, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ziaeetabar_2018_IROS,
              author = "Ziaeetabar, F. and Kulvicius, T. and Tamosiunaite, M. and Wörgötter, F.",
              booktitle = "IROS",
              title = "Prediction Of Manipulation Action Classes Using Semantic Spatial Reasoning",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Abramov_2012_WACV,
          author = "Abramov, Alexey and Pauwels, Karl and Papon, Jeremie and Wörgötter, Florentin and Dellen, Babette",
          title = "Depth-Supported Real-Time Video Segmentation With The Kinect",
          booktitle = "WACV",
          year = "2012"
      }
      
    Georgia Tech Egocentric Activity Gaze (GTEA Gaze) link paper
    • Summary: An egocentric dataset of 17 cooking activity videos performed by 14 subjects
    • Applications: Action prediction
    • Data type and annotations: RGB, gaze, mask, activity label, temporal segment
    • Task: Cooking (egocentric)
      Used in papers
        Shen et al., "Egocentric Activity Prediction Via Event Modulated Attention", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Shen_2018_ECCV,
              author = "Shen, Yang and Ni, Bingbing and Li, Zefan and Zhuang, Ning",
              title = "Egocentric Activity Prediction Via Event Modulated Attention",
              booktitle = "ECCV",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Fathi_2012_ECCV,
          author = "Fathi, Alireza and Li, Yin and Rehg, James M",
          title = "Learning To Recognize Daily Actions Using Gaze",
          booktitle = "ECCV",
          year = "2012"
      }
      

2011

↑ top
    Town Center link paper
    • Summary: A dataset of surveillance recordings of 2.2K pedestrians walking at the Oxford Town Center
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, bounding box
    • Task: Surveillance
      Used in papers
        Hasan et al., "Mx-Lstm: Mixing Tracklets And Vislets To Jointly Forecast Trajectories And Head Poses", CVPR, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Hasan_2018_CVPR,
              author = "Hasan, Irtiza and Setti, Francesco and Tsesmelis, Theodore and Del Bue, Alessio and Galasso, Fabio and Cristani, Marco",
              title = "Mx-Lstm: Mixing Tracklets And Vislets To Jointly Forecast Trajectories And Head Poses",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Hasan et al., ""Seeing Is Believing": Pedestrian Trajectory Forecasting Using Visual Frustum Of Attention", WACV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Hasan_2018_WACV,
              author = "Hasan, I. and Setti, F. and Tsesmelis, T. and Del Bue, A. and Cristani, M. and Galasso, F.",
              booktitle = "WACV",
              title = "``Seeing Is Believing'': Pedestrian Trajectory Forecasting Using Visual Frustum Of Attention",
              year = "2018"
          }
          
        Ma et al., "Forecasting Interactive Dynamics Of Pedestrians With Fictitious Play", CVPR, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ma_2017_CVPR,
              author = "Ma, Wei-Chiu and Huang, De-An and Lee, Namhoon and Kitani, Kris M.",
              title = "Forecasting Interactive Dynamics Of Pedestrians With Fictitious Play",
              booktitle = "CVPR",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Benfold_2011_CVPR,
          author = "Benfold, Ben and Reid, Ian",
          title = "Stable Multi-Target Tracking In Real-Time Surveillance Video",
          booktitle = "CVPR",
          year = "2011"
      }
      
    VIRAT link paper
    • Summary: A multiview dataset of 12 events, such as a person loading an object to a vehicle, a person opening a vehicle trunk, recorded in 11 scenes for a total of approx. 8.5 hours of video footage
    • Applications: Action prediction, Trajectory prediction
    • Data type and annotations: RGB, bounding box, activity label, temporal segment
    • Task: Surveillance, Activity
      Used in papers
        Mahmud et al., "Joint Prediction Of Activity Labels And Starting Times In Untrimmed Videos", ICCV, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Mahmud_2017_ICCV,
              author = "Mahmud, Tahmida and Hasan, Mahmudul and Roy-Chowdhury, Amit K.",
              title = "Joint Prediction Of Activity Labels And Starting Times In Untrimmed Videos",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Vasquez, "Novel Planning-Based Algorithms For Human Motion Prediction", ICRA, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Vasquez_2016_ICRA,
              author = "Vasquez, D.",
              booktitle = "ICRA",
              title = "Novel Planning-Based Algorithms For Human Motion Prediction",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Oh_2011_CVPR,
          author = "Oh, Sangmin and Hoogs, Anthony and Perera, Amitha and Cuntoor, Naresh and Chen, Chia-Chih and Lee, Jong Taek and Mukherjee, Saurajit and Aggarwal, JK and Lee, Hyungtae and Davis, Larry and others",
          title = "A Large-Scale Benchmark Dataset For Event Recognition In Surveillance Video",
          booktitle = "CVPR",
          year = "2011"
      }
      
    Stanford-40 link paper
    • Summary: A dataset of 40 actions with 9.5K+ RGB images and the corresponding bounding boxes around actors
    • Applications: Action prediction
    • Data type and annotations: RGB (image), bounding box, activity label
    • Task: Activity
      Used in papers
        Safaei et al., "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution", WACV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Safaei_2019_WACV,
              author = "Safaei, M. and Foroosh, H.",
              booktitle = "WACV",
              title = "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Yao_2011_ICCV,
          author = "Yao, Bangpeng and Jiang, Xiaoye and Khosla, Aditya and Lin, Andy Lai and Guibas, Leonidas and Fei-Fei, Li",
          title = "Human Action Recognition By Learning Bases Of Action Attributes And Parts",
          booktitle = "ICCV",
          year = "2011"
      }
      
    Human Motion Database (HMDB) link paper
    • Summary: A dataset of 6.8K+ video clips of 51 actions corresponding to general facial actions (laughing), facial actions with object manipulation (smoking), general body movements (clapping hands), body movements with object interaction (catching), and body movements for human interaction (fencing)
    • Applications: Action prediction
    • Data type and annotations: RGB, bounding box, mask, activity label, attribute
    • Task: Activity
      Used in papers
        Cho et al., "A Temporal Sequence Learning For Action Recognition And Prediction", WACV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Cho_2018_WACV,
              author = "Cho, S. and Foroosh, H.",
              booktitle = "WACV",
              title = "A Temporal Sequence Learning For Action Recognition And Prediction",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Kuehne_2011_ICCV,
          author = "Kuehne, H. and Jhuang, H. and Garrote, E. and Poggio, T. and Serre, T.",
          title = "Hmdb: A Large Video Database For Human Motion Recognition",
          booktitle = "ICCV",
          year = "2011"
      }
      
    Ford Campus Vision LiDAR (FCVL) link paper
    • Summary: A dataset of LIDAR scans and IMU readings with the corresponding images collected using a Ford F-250 autonomous pickup truck with approx. 200 GB of data
    • Applications: Other prediction
    • Data type and annotations: RGB, LIDAR, vehicle sensors
    • Task: Driving
      Used in papers
        Choi et al., "Robust Modeling And Prediction In Dynamic Environments Using Recurrent Flow Networks", IROS, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Choi_2016_IROS,
              author = "Choi, S. and Lee, K. and Oh, S.",
              booktitle = "IROS",
              title = "Robust Modeling And Prediction In Dynamic Environments Using Recurrent Flow Networks",
              year = "2016"
          }
          
      Bibtex
      @Article{Pandey_2011_IJRR,
          author = "Pandey, Gaurav and McBride, James R and Eustice, Ryan M",
          title = "Ford Campus Vision And Lidar Data Set",
          journal = "The International Journal of Robotics Research (IJRR)",
          volume = "30",
          number = "13",
          pages = "1543--1552",
          year = "2011"
      }
      

2010

↑ top
    UT Interaction (UTI) link
    • Summary: A dataset of 6 human-human interactions, such as shaking hands and hugging, with 20 video clips of subjects wearing different clothing, recorded at 30fps
    • Applications: Action prediction
    • Data type and annotations: RGB, bounding box, activity label, temporal segment
    • Task: Interaction
      Used in papers
        Gammulle et al., "Predicting The Future: A Jointly Learnt Model For Action Anticipation", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Gammulle_2019_ICCV,
              author = "Gammulle, Harshala and Denman, Simon and Sridharan, Sridha and Fookes, Clinton",
              title = "Predicting The Future: A Jointly Learnt Model For Action Anticipation",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Chen et al., "Part-Activated Deep Reinforcement Learning For Action Prediction", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Chen_2018_ECCV,
              author = "Chen, Lei and Lu, Jiwen and Song, Zhanjie and Zhou, Jie",
              title = "Part-Activated Deep Reinforcement Learning For Action Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Shi et al., "Action Anticipation With Rbf Kernelized Feature Mapping Rnn", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Shi_2018_ECCV,
              author = "Shi, Yuge and Fernando, Basura and Hartley, Richard",
              title = "Action Anticipation With Rbf Kernelized Feature Mapping Rnn",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Aliakbarian et al., "Encouraging Lstms To Anticipate Actions Very Early", ICCV, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Aliakbarian_2017_ICCV,
              author = "Sadegh Aliakbarian, Mohammad and Sadat Saleh, Fatemeh and Salzmann, Mathieu and Fernando, Basura and Petersson, Lars and Andersson, Lars",
              title = "Encouraging Lstms To Anticipate Actions Very Early",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Xu et al., "Human Activities Prediction By Learning Combinatorial Sparse Representations", ICIP, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2016_ICIP,
              author = "Xu, K. and Qin, Z. and Wang, G.",
              booktitle = "ICIP",
              title = "Human Activities Prediction By Learning Combinatorial Sparse Representations",
              year = "2016"
          }
          
        Lee et al., "Human Activity Prediction Based On Sub-Volume Relationship Descriptor", ICPR, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Lee_2016_ICPR,
              author = "Lee, Dong-Gyu and Lee, Seong-Whan",
              booktitle = "ICPR",
              title = "Human Activity Prediction Based On Sub-Volume Relationship Descriptor",
              year = "2016"
          }
          
        Xu et al., "Activity Auto-Completion: Predicting Human Activities From Partial Videos", ICCV, 2015. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2015_ICCV,
              author = "Xu, Zhen and Qing, Laiyun and Miao, Jun",
              title = "Activity Auto-Completion: Predicting Human Activities From Partial Videos",
              booktitle = "ICCV",
              year = "2015"
          }
          
      Bibtex
      @Misc{Ryoo_2010_UT,
          author = "Ryoo, M. S. and Aggarwal, J. K.",
          title = "UT-Interaction Dataset, ICPR Contest On Semantic Description Of Human Activities (SDHA)",
          year = "2010",
          url = "http://cvrc.ece.utexas.edu/SDHA2010/Human_Interaction.html"
      }
      
    TV Human Interaction (THI) link paper
    • Summary: A dataset of 300 video clips collected from 20+ different TV shows containing 4 interactions: handshakes, high fives, hugs, and kisses
    • Applications: Action prediction
    • Data type and annotations: RGB, bounding box, head pose, activity label
    • Task: Interaction
      Used in papers
        Gammulle et al., "Predicting The Future: A Jointly Learnt Model For Action Anticipation", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Gammulle_2019_ICCV,
              author = "Gammulle, Harshala and Denman, Simon and Sridharan, Sridha and Fookes, Clinton",
              title = "Predicting The Future: A Jointly Learnt Model For Action Anticipation",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Zhong et al., "Unsupervised Learning For Forecasting Action Representations", ICIP, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Zhong_2018_ICIP,
              author = "Zhong, Y. and Zheng, W.",
              booktitle = "ICIP",
              title = "Unsupervised Learning For Forecasting Action Representations",
              year = "2018"
          }
          
        Zeng et al., "Visual Forecasting By Imitating Dynamics In Natural Sequences", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zeng_2017_ICCV,
              author = "Zeng, Kuo-Hao and Shen, William B. and Huang, De-An and Sun, Min and Carlos Niebles, Juan",
              title = "Visual Forecasting By Imitating Dynamics In Natural Sequences",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Gao et al., "Red: Reinforced Encoder-Decoder Networks For Action Anticipation", BMVC, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Gao_2017_BMVC,
              author = "Gao, Jiyang and Yang, Zhenheng and Nevatia, Ram",
              title = "Red: Reinforced Encoder-Decoder Networks For Action Anticipation",
              year = "2017",
              booktitle = "BMVC"
          }
          
        Vondrick et al., "Anticipating Visual Representations From Unlabeled Video", CVPR, 2016. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Vondrick_2016_CVPR_2,
              author = "Vondrick, Carl and Pirsiavash, Hamed and Torralba, Antonio",
              title = "Anticipating Visual Representations From Unlabeled Video",
              booktitle = "CVPR",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Patron_2010_BMVC,
          author = "Patron-Perez, Alonso and Marszalek, Marcin and Zisserman, Andrew and Reid, Ian D",
          title = "High Five: Recognising Human Interactions In TV Shows",
          booktitle = "BMVC",
          year = "2010"
      }
      
    Willow Action link paper
    • Summary: A dataset of 7 actions, e.g. riding a bike, riding a horse, running, depicted in 968 RGB and grayscale images
    • Applications: Action prediction
    • Data type and annotations: RGB (image), Grayscale (image), activity label
    • Task: Activity
      Used in papers
        Safaei et al., "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution", WACV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Safaei_2019_WACV,
              author = "Safaei, M. and Foroosh, H.",
              booktitle = "WACV",
              title = "Still Image Action Recognition By Predicting Spatial-Temporal Pixel Evolution",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Delaitre_2010_BMVC,
          author = "Delaitre, V. and Laptev, I. and Sivic, J.",
          title = "Recognizing Human Actions In Still Images: A Study Of Bag-Of-Features And Part-Based Representations",
          booktitle = "BMVC",
          year = "2010"
      }
      
    ViSOR link paper
    • Summary: A repository of various surveillance footage of pedestrians in indoor and outdoor environments with 162 video clips and 1M+ frames
    • Applications: Video prediction
    • Data type and annotations: RGB, bounding box, pose, attribute
    • Task: Surveillance
      Used in papers
        Lu et al., "Flexible Spatio-Temporal Networks For Video Prediction", CVPR, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Lu_2017_CVPR,
              author = "Lu, Chaochao and Hirsch, Michael and Sch{\"o}lkopf, Bernhard",
              title = "Flexible Spatio-Temporal Networks For Video Prediction",
              booktitle = "CVPR",
              year = "2017"
          }
          
      Bibtex
      @Article{Vezzani_2010_MTA,
          author = "Vezzani, Roberto and Cucchiara, Rita",
          title = "Video Surveillance Online Repository (ViSOR): An Integrated Framework",
          journal = "Multimedia Tools and Applications",
          volume = "50",
          number = "2",
          pages = "359--380",
          year = "2010"
      }
      
    PROST link paper
    • Summary: A dataset of 4K+ frames for tracking objects in the presence of camera motion
    • Applications: Video prediction
    • Data type and annotations: RGB, bounding box
    • Task: Object
      Used in papers
        Lu et al., "Flexible Spatio-Temporal Networks For Video Prediction", CVPR, 2017. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Lu_2017_CVPR,
              author = "Lu, Chaochao and Hirsch, Michael and Sch{\"o}lkopf, Bernhard",
              title = "Flexible Spatio-Temporal Networks For Video Prediction",
              booktitle = "CVPR",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Santner_2010_CVPR,
          author = "Santner, Jakob and Leistner, Christian and Saffari, Amir and Pock, Thomas and Bischof, Horst",
          title = "PROST: Parallel Robust Online Simple Tracking",
          booktitle = "CVPR",
          year = "2010"
      }
      
    MUG link paper
    • Summary: A dataset of 86 human subjects performing 6 types of basic expressions including anger, disgust, fear, happiness, sadness, and surprise recorded at 19fps for a total of 1462 sequences
    • Applications: Video prediction
    • Data type and annotations: RGB, keypoints, motion label
    • Task: Face (expression)
      Used in papers
        Zhao et al., "Learning To Forecast And Refine Residual Motion For Image-To-Video Generation", ECCV, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhao_2018_ECCV,
              author = "Zhao, Long and Peng, Xi and Tian, Yu and Kapadia, Mubbasir and Metaxas, Dimitris",
              title = "Learning To Forecast And Refine Residual Motion For Image-To-Video Generation",
              booktitle = "ECCV",
              year = "2018"
          }
          
      Bibtex
      @Article{Aifanti_2010_WIAMIS,
          author = "Aifanti, Niki and Papachristou, Christos and Delopoulos, Anastasios",
          title = "The MUG Facial Expression Database",
          journal = "WIAMIS",
          year = "2010"
      }
      
    MSR link paper
    • Summary: A dataset of 20 actions, such as high arm wave, horizontal arm wave, hammer, and forward punch, recorded using a depth camera at 15fps for a total of 23K+ frames
    • Applications: Video prediction
    • Data type and annotations: Depth, activity label
    • Task: Activity
      Used in papers
        Wang et al., "Order Matters: Shuffling Sequence Generation For Video Prediction", BMVC, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2019_BMVC,
              author = "Wang, Junyan and Hu, Bingzhang and Long, Yang and Guan, Yu",
              title = "Order Matters: Shuffling Sequence Generation For Video Prediction",
              year = "2019",
              booktitle = "BMVC"
          }
          
      Bibtex
      @InProceedings{Li_2010_CVPRW,
          author = "Li, Wanqing and Zhang, Zhengyou and Liu, Zicheng",
          title = "Action Recognition Based On A Bag Of 3D Points",
          booktitle = "CVPRW",
          year = "2010"
      }
      
    DIPLECS link paper
    • Summary: A dataset of 3.5 hours of driving with the corresponding steering angle computed based on a marker on the steering wheel
    • Applications: Other prediction
    • Data type and annotations: RGB, vehicle sensors
    • Task: Driving
      Used in papers
        He et al., "Aggregated Sparse Attention For Steering Angle Prediction", ICPR, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{He_2018_ICPR,
              author = "He, S. and Kangin, D. and Mi, Y. and Pugeault, N.",
              booktitle = "ICPR",
              title = "Aggregated Sparse Attention For Steering Angle Prediction",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Pugeault_2010_ECCV,
          author = "Pugeault, Nicolas and Bowden, Richard",
          title = "Learning Pre-Attentive Driving Behaviour From Holistic Visual Features",
          booktitle = "ECCV",
          year = "2010"
      }
      
    Taxi BJ link paper arxiv
    • Summary: A dataset of GPS data collected from 10K+ taxis in Beijing, sampled every 117 seconds
    • Applications: Video prediction
    • Data type and annotations: GPS
    • Task: Driving
      Used in papers
        Le Guen et al., "Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Guen_2020_CVPR,
              author = "Le Guen, Vincent and Thome, Nicolas",
              title = "Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
      Bibtex
      @inproceedings{Yuan_2010_ICAGIS,
          author = "Yuan, Jing and Zheng, Yu and Zhang, Chengyang and Xie, Wenlei and Xie, Xing and Sun, Guangzhong and Huang, Yan",
          title = "T-Drive: Driving Directions Based on Taxi Trajectories",
          booktitle = "International Conference on Advances in Geographic Information Systems",
          pages = "99--108",
          year = "2010"
      }
      

2009

↑ top
    ETH link paper
    • Summary: A dataset of pedestrian trajectories with 650 tracks in 25+ minutes of video footage
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, trajectory
    • Task: Surveillance
      Used in papers
        Fang et al., "TPNet: Trajectory Proposal Network for Motion Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Fang_2020_CVPR,
              author = "Fang, Liangji and Jiang, Qinhong and Shi, Jianping and Zhou, Bolei",
              title = "TPNet: Trajectory Proposal Network for Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Hu et al., "Collaborative Motion Prediction via Neural Motion Message Passing", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Hu_2020_CVPR,
              author = "Hu, Yue and Chen, Siheng and Zhang, Ya and Gu, Xiao",
              title = "Collaborative Motion Prediction via Neural Motion Message Passing",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Sun et al., "Recursive Social Behavior Graph for Trajectory Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sun_2020_CVPR,
              author = "Sun, Jianhua and Jiang, Qinhong and Lu, Cewu",
              title = "Recursive Social Behavior Graph for Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Sun et al., "Reciprocal Learning Networks for Human Trajectory Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sun_2020_CVPR_2,
              author = "Sun, Hao and Zhao, Zhiqun and He, Zhihai",
              title = "Reciprocal Learning Networks for Human Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Haddad et al., "Self-Growing Spatial Graph Networks for Pedestrian Trajectory Prediction", WACV, 2020. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Haddad_2020_WACV,
              author = "Haddad, Sirin and Lam, Siew-Kei",
              title = "Self-Growing Spatial Graph Networks for Pedestrian Trajectory Prediction",
              booktitle = "WACV",
              year = "2020"
          }
          
        Li, "Which Way Are You Going? Imitative Decision Learning For Path Forecasting In Dynamic Scenes", CVPR, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2019_CVPR,
              author = "Li, Yuke",
              title = "Which Way Are You Going? Imitative Decision Learning For Path Forecasting In Dynamic Scenes",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Liang et al., "Peeking Into The Future: Predicting Future Person Activities And Locations In Videos", CVPR, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2019_CVPR,
              author = "Liang, Junwei and Jiang, Lu and Niebles, Juan Carlos and Hauptmann, Alexander G. and Fei-Fei, Li",
              title = "Peeking Into The Future: Predicting Future Person Activities And Locations In Videos",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Sadeghian et al., "Sophie: An Attentive Gan For Predicting Paths Compliant To Social And Physical Constraints", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sadeghian_2019_CVPR,
              author = "Sadeghian, Amir and Kosaraju, Vineet and Sadeghian, Ali and Hirose, Noriaki and Rezatofighi, Hamid and Savarese, Silvio",
              title = "Sophie: An Attentive Gan For Predicting Paths Compliant To Social And Physical Constraints",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Zhang et al., "Sr-Lstm: State Refinement For Lstm Towards Pedestrian Trajectory Prediction", CVPR, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhang_2019_CVPR,
              author = "Zhang, Pu and Ouyang, Wanli and Zhang, Pengfei and Xue, Jianru and Zheng, Nanning",
              title = "Sr-Lstm: State Refinement For Lstm Towards Pedestrian Trajectory Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Zhao et al., "Multi-Agent Tensor Fusion For Contextual Trajectory Prediction", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zhao_2019_CVPR,
              author = "Zhao, Tianyang and Xu, Yifei and Monfort, Mathew and Choi, Wongun and Baker, Chris and Zhao, Yibiao and Wang, Yizhou and Wu, Ying Nian",
              title = "Multi-Agent Tensor Fusion For Contextual Trajectory Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Choi et al., "Looking To Relations For Future Trajectory Forecast", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Choi_2019_ICCV,
              author = "Choi, Chiho and Dariush, Behzad",
              title = "Looking To Relations For Future Trajectory Forecast",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Huang et al., "Stgat: Modeling Spatial-Temporal Interactions For Human Trajectory Prediction", ICCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Huang_2019_ICCV,
              author = "Huang, Yingfan and Bi, Huikun and Li, Zhaoxin and Mao, Tianlu and Wang, Zhaoqi",
              title = "Stgat: Modeling Spatial-Temporal Interactions For Human Trajectory Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Kosaraju et al., "Social-Bigat: Multimodal Trajectory Forecasting Using Bicycle-Gan And Graph Attention Networks", NeurIPS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Kosaraju_2019_NeurIPS,
              author = "Kosaraju, Vineet and Sadeghian, Amir and Mart{\'\i}n-Mart{\'\i}n, Roberto and Reid, Ian and Rezatofighi, Hamid and Savarese, Silvio",
              title = "Social-Bigat: Multimodal Trajectory Forecasting Using Bicycle-Gan And Graph Attention Networks",
              booktitle = "NeurIPS",
              year = "2019"
          }
          
        Anderson et al., "Stochastic Sampling Simulation For Pedestrian Trajectory Prediction", IROS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Anderson_2019_IROS,
              author = "Anderson, Cyrus and Du, Xiaoxiao and Vasudevan, Ram and Johnson-Roberson, Matthew",
              booktitle = "IROS",
              title = "Stochastic Sampling Simulation For Pedestrian Trajectory Prediction",
              year = "2019"
          }
          
        Li et al., "Conditional Generative Neural System For Probabilistic Trajectory Prediction", IROS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2019_IROS,
              author = "Li, Jiachen and Ma, Hengbo and Tomizuka, Masayoshi",
              booktitle = "IROS",
              title = "Conditional Generative Neural System For Probabilistic Trajectory Prediction",
              year = "2019"
          }
          
        Zhu et al., "Starnet: Pedestrian Trajectory Prediction Using Deep Neural Network In Star Topology", IROS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zhu_2019_IROS,
              author = "Zhu, Yanliang and Qian, Deheng and Ren, Dongchun and Xia, Huaxia",
              booktitle = "IROS",
              title = "Starnet: Pedestrian Trajectory Prediction Using Deep Neural Network In Star Topology",
              year = "2019"
          }
          
        Xue et al., "Location-Velocity Attention For Pedestrian Trajectory Prediction", WACV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xue_2019_WACV,
              author = "Xue, H. and Huynh, D. and Reynolds, M.",
              booktitle = "WACV",
              title = "Location-Velocity Attention For Pedestrian Trajectory Prediction",
              year = "2019"
          }
          
        Gupta et al., "Social Gan: Socially Acceptable Trajectories With Generative Adversarial Networks", CVPR, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Gupta_2018_CVPR,
              author = "Gupta, Agrim and Johnson, Justin and Fei-Fei, Li and Savarese, Silvio and Alahi, Alexandre",
              title = "Social Gan: Socially Acceptable Trajectories With Generative Adversarial Networks",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Xu et al., "Encoding Crowd Interaction With Deep Neural Network For Pedestrian Trajectory Prediction", CVPR, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_CVPR_encoding,
              author = "Xu, Yanyu and Piao, Zhixin and Gao, Shenghua",
              title = "Encoding Crowd Interaction With Deep Neural Network For Pedestrian Trajectory Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Fernando et al., "Gd-Gan: Generative Adversarial Networks For Trajectory Prediction And Group Detection In Crowds", ACCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Fernando_2018_ACCV,
              author = "Fernando, Tharindu and Denman, Simon and Sridharan, Sridha and Fookes, Clinton",
              editor = "Jawahar, C. V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Gd-Gan: Generative Adversarial Networks For Trajectory Prediction And Group Detection In Crowds",
              booktitle = "ACCV",
              year = "2019"
          }
          
        Pfeiffer et al., "A Data-Driven Model For Interaction-Aware Pedestrian Motion Prediction In Object Cluttered Environments", ICRA, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Pfeiffer_2018_ICRA,
              author = "Pfeiffer, M. and Paolo, G. and Sommer, H. and Nieto, J. and Siegwart, R. and Cadena, C.",
              booktitle = "ICRA",
              title = "A Data-Driven Model For Interaction-Aware Pedestrian Motion Prediction In Object Cluttered Environments",
              year = "2018"
          }
          
        Vemula et al., "Social Attention: Modeling Attention In Human Crowds", ICRA, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Vemula_2018_ICRA,
              author = "Vemula, Anirudh and Muelling, Katharina and Oh, Jean",
              title = "Social Attention: Modeling Attention In Human Crowds",
              booktitle = "ICRA",
              year = "2018"
          }
          
        Xue et al., "Ss-Lstm: A Hierarchical Lstm Model For Pedestrian Trajectory Prediction", WACV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xue_2018_WACV,
              author = "Xue, H. and Huynh, D. Q. and Reynolds, M.",
              booktitle = "WACV",
              title = "Ss-Lstm: A Hierarchical Lstm Model For Pedestrian Trajectory Prediction",
              year = "2018"
          }
          
        Alahi et al., "Social Lstm: Human Trajectory Prediction In Crowded Spaces", CVPR, 2016. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Alahi_2016_CVPR,
              author = "Alahi, Alexandre and Goel, Kratarth and Ramanathan, Vignesh and Robicquet, Alexandre and Fei-Fei, Li and Savarese, Silvio",
              title = "Social Lstm: Human Trajectory Prediction In Crowded Spaces",
              booktitle = "CVPR",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Pellegrini_2009_ICCV,
          author = "Pellegrini, Stefano and Ess, Andreas and Schindler, Konrad and Van Gool, Luc",
          title = "You'll Never Walk Alone: Modeling Social Behavior For Multi-Target Tracking",
          booktitle = "ICCV",
          year = "2009"
      }
      
    Caltech Pedestrian link paper
    • Summary: A pedestrian detection dataset with 2.3K unique samples and approx. 10 hours of video footage, recorded and annotated at 30 Hz
    • Applications: Video prediction, Action prediction
    • Data type and annotations: RGB, bounding box, Tracking ID
    • Task: Driving
      Used in papers
        Jin et al., "Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Jin_2020_CVPR,
              author = "Jin, Beibei and Hu, Yu and Tang, Qiankun and Niu, Jingyu and Shi, Zhiping and Han, Yinhe and Li, Xiaowei",
              title = "Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Kwon et al., "Predicting Future Frames Using Retrospective Cycle Gan", CVPR, 2019. paper
        Gao et al., "Disentangling Propagation And Generation For Video Prediction", ICCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Gao_2019_ICCV,
              author = "Gao, Hang and Xu, Huazhe and Cai, Qi-Zhi and Wang, Ruth and Yu, Fisher and Darrell, Trevor",
              title = "Disentangling Propagation And Generation For Video Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Ho et al., "Sme-Net: Sparse Motion Estimation For Parametric Video Prediction Through Reinforcement Learning", ICCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ho_2019_ICCV,
              author = "Ho, Yung-Han and Cho, Chuan-Yuan and Peng, Wen-Hsiao and Jin, Guo-Lun",
              title = "Sme-Net: Sparse Motion Estimation For Parametric Video Prediction Through Reinforcement Learning",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Ho et al., "Deep Reinforcement Learning For Video Prediction", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ho_2019_ICIP,
              author = "Ho, Y. and Cho, C. and Peng, W.",
              booktitle = "ICIP",
              title = "Deep Reinforcement Learning For Video Prediction",
              year = "2019"
          }
          
        Byeon et al., "Contextvp: Fully Context-Aware Video Prediction", ECCV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Byeon_2018_ECCV,
              author = "Byeon, Wonmin and Wang, Qin and Kumar Srivastava, Rupesh and Koumoutsakos, Petros",
              title = "Contextvp: Fully Context-Aware Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Liu et al., "Dyan: A Dynamical Atoms-Based Network For Video Prediction", ECCV, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Liu_2018_ECCV,
              author = "Liu, Wenqian and Sharma, Abhishek and Camps, Octavia and Sznaier, Mario",
              title = "Dyan: A Dynamical Atoms-Based Network For Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Reda et al., "Sdc-Net: Video Prediction Using Spatially-Displaced Convolution", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Reda_2018_ECCV,
              author = "Reda, Fitsum A. and Liu, Guilin and Shih, Kevin J. and Kirby, Robert and Barker, Jon and Tarjan, David and Tao, Andrew and Catanzaro, Bryan",
              title = "Sdc-Net: Video Prediction Using Spatially-Displaced Convolution",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Liang et al., "Dual Motion Gan For Future-Flow Embedded Video Prediction", ICCV, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2017_ICCV,
              author = "Liang, Xiaodan and Lee, Lisa and Dai, Wei and Xing, Eric P.",
              title = "Dual Motion Gan For Future-Flow Embedded Video Prediction",
              booktitle = "ICCV",
              year = "2017"
          }
          
        Hariyono et al., "Estimation Of Collision Risk For Improving Driver's Safety", IECON, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Hariyono_2016_IES,
              author = "Hariyono, Joko and Shahbaz, Ajmal and Kurnianggoro, Laksono and Jo, Kang-Hyun",
              title = "Estimation Of Collision Risk For Improving Driver's Safety",
              booktitle = "IECON",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Dollar_2009_CVPR,
          author = "Doll\'ar, P. and Wojek, C. and Schiele, B. and Perona, P.",
          title = "Pedestrian Detection: A Benchmark",
          booktitle = "CVPR",
          year = "2009"
      }
      
    YUV Videos link
    • Summary: A collection of color video clips with different subjects and resolutions
    • Applications: Video prediction
    • Data type and annotations: RGB
    • Task: Mix videos
      Used in papers
        Ho et al., "Sme-Net: Sparse Motion Estimation For Parametric Video Prediction Through Reinforcement Learning", ICCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ho_2019_ICCV,
              author = "Ho, Yung-Han and Cho, Chuan-Yuan and Peng, Wen-Hsiao and Jin, Guo-Lun",
              title = "Sme-Net: Sparse Motion Estimation For Parametric Video Prediction Through Reinforcement Learning",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Ho et al., "Deep Reinforcement Learning For Video Prediction", ICIP, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Ho_2019_ICIP,
              author = "Ho, Y. and Cho, C. and Peng, W.",
              booktitle = "ICIP",
              title = "Deep Reinforcement Learning For Video Prediction",
              year = "2019"
          }
          
      Bibtex
      @Misc{ASU_2009_YUV,
          author = "{ASU Video Trace Library}",
          title = "Yuv Video Sequences",
          year = "2009",
          howpublished = "http://trace.kom.aau.dk/yuv/index.html"
      }
      
    Edinburgh Informatics Forum Pedestrian (EIFP) link paper
    • Summary: A dataset of 92K+ trajectories recorded with a top-down view camera capturing people walking inside a campus area over a period of several months
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, bounding box, Tracking ID
    • Task: Surveillance
      Used in papers
        Carvalho et al., "Long-Term Prediction Of Motion Trajectories Using Path Homology Clusters", IROS, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Carvalho_2019_IROS,
              author = "Carvalho, J Frederico and Vejdemo-Johansson, Mikael and Pokorny, Florian T and Kragic, Danica",
              booktitle = "IROS",
              title = "Long-Term Prediction Of Motion Trajectories Using Path Homology Clusters",
              year = "2019"
          }
          
        Zhi et al., "Kernel Trajectory Maps For Multi-Modal Probabilistic Motion Prediction", CoRL, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhi_2019_CORL,
              author = "Zhi, Weiming and Ott, Lionel and Ramos, Fabio",
              title = "Kernel Trajectory Maps For Multi-Modal Probabilistic Motion Prediction",
              booktitle = "CoRL",
              year = "2019"
          }
          
      Bibtex
      @mastersthesis{Majecka_2009,
          author = "Majecka, Barbara",
          title = "Statistical Models Of Pedestrian Behaviour In The Forum",
          school = "School of Informatics, University of Edinburgh",
          year = "2009"
      }
      
    TUM Kitchen link paper
    • Summary: A dataset of 20 sequences of multiview video and multimodal sensor recordings of common activities in a kitchen environment
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, RFID, 3D pose, activity label, temporal segment
    • Task: Activity
      Used in papers
        Vo et al., "Augmenting Physical State Prediction Through Structured Activity Inference", ICRA, 2015. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Vo_2015_ICRA,
              author = "Vo, N. N. and Bobick, A. F.",
              booktitle = "ICRA",
              title = "Augmenting Physical State Prediction Through Structured Activity Inference",
              year = "2015"
          }
          
      Bibtex
      @InProceedings{Tenorth_2009_ICCVW,
          author = "Tenorth, Moritz and Bandouch, Jan and Beetz, Michael",
          title = "The Tum Kitchen Data Set Of Everyday Manipulation Activities For Motion Tracking And Action Recognition",
          booktitle = "ICCVW",
          year = "2009"
      }
      
    QMUL link paper
    • Summary: A dataset of surveillance footage of road traffic with 90K+ frames recorded at 25 Hz
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, trajectory
    • Task: Driving, Anomaly
      Used in papers
        Yoo et al., "Visual Path Prediction In Complex Scenes With Crowded Moving Objects", CVPR, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Yoo_2016_CVPR,
              author = "Yoo, YoungJoon and Yun, Kimin and Yun, Sangdoo and Hong, JongHee and Jeong, Hawook and Young Choi, Jin",
              title = "Visual Path Prediction In Complex Scenes With Crowded Moving Objects",
              booktitle = "CVPR",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Loy_2009_BMVC,
          author = "Loy, Chen Change and Xiang, Tao and Gong, Shaogang",
          title = "Modelling Multi-Object Activity By Gaussian Processes",
          booktitle = "BMVC",
          year = "2009"
      }
      
    PETS2009 link paper
    • Summary: A dataset of crowd activities, e.g. walking, running, evacuation (rapid dispersion), and local dispersion, recorded from multiple views (up to 8) with different crowd densities
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, bounding box
    • Task: Surveillance
      Used in papers
        Xue et al., "Location-Velocity Attention For Pedestrian Trajectory Prediction", WACV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xue_2019_WACV,
              author = "Xue, H. and Huynh, D. and Reynolds, M.",
              booktitle = "WACV",
              title = "Location-Velocity Attention For Pedestrian Trajectory Prediction",
              year = "2019"
          }
          
      Bibtex
      @InProceedings{Ferryman_2009_PETS,
          author = "Ferryman, J. and Shahrokni, A.",
          booktitle = "PETS",
          title = "Pets2009: Dataset And Challenge",
          year = "2009"
      }
      
    OSU link paper
    • Summary: A dataset of 20 videos, each with approx. 400 frames, of different football matches
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, bounding box, attribute, Tracking ID
    • Task: Sport
      Used in papers
        Lee et al., "Predicting Wide Receiver Trajectories In American Football", WACV, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Lee_2016_WACV,
              author = "Lee, N. and Kitani, K. M.",
              booktitle = "WACV",
              title = "Predicting Wide Receiver Trajectories In American Football",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Hess_2009_CVPR,
          author = "Hess, Rob and Fern, Alan",
          title = "Discriminatively Trained Particle Filters For Complex Multi-Object Tracking",
          booktitle = "CVPR",
          year = "2009"
      }
      
    Collective Activity (CA) link paper
    • Summary: A dataset of 40+ video clips showing collective activities including crossing, waiting, queueing, walking and talking
    • Applications: Action prediction, Trajectory prediction, Motion prediction
    • Data type and annotations: RGB, bounding box, attribute, activity label, temporal segment, pose
    • Task: Interaction
      Used in papers
        Yao et al., "Multiple Granularity Group Interaction Prediction", CVPR, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Yao_2018_CVPR,
              author = "Yao, Taiping and Wang, Minsi and Ni, Bingbing and Wei, Huawei and Yang, Xiaokang",
              title = "Multiple Granularity Group Interaction Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Choi_2009_ICCVW,
          author = "Choi, Wongun and Shahid, Khuram and Savarese, Silvio",
          title = "What Are They Doing? : Collective Activity Classification Using Spatio-Temporal Relationship Among People",
          booktitle = "ICCVW",
          year = "2009"
      }
      

2008

↑ top
    MIT Trajectory (MITT) link paper
    • Summary: A dataset of 40K+ trajectories recorded in a parking lot over five days
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, trajectory
    • Task: Surveillance
      Used in papers
        Akbarzadeh et al., "Kernel Density Estimation For Target Trajectory Prediction", IROS, 2015. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Akbarzadeh_2015_IROS,
              author = "Akbarzadeh, V. and Gagné, C. and Parizeau, M.",
              booktitle = "IROS",
              title = "Kernel Density Estimation For Target Trajectory Prediction",
              year = "2015"
          }
          
      Bibtex
      @InProceedings{Grimson_2008_CVPR,
          author = "Grimson, Eric and Wang, Xiaogang and Ng, Gee-Wah and Ma, Keng Teck",
          title = "Trajectory Analysis And Semantic Region Modeling Using A Nonparametric Bayesian Model",
          booktitle = "CVPR",
          year = "2008"
      }
      
    Daimler link paper
    • Summary: A grayscale dataset of 70K+ pedestrian samples recorded during the course of 27 minutes of driving
    • Applications: Action prediction
    • Data type and annotations: Grayscale, bounding box
    • Task: Driving
      Used in papers
        Hariyono et al., "Estimation Of Collision Risk For Improving Driver's Safety", IECON, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Hariyono_2016_IES,
              author = "Hariyono, Joko and Shahbaz, Ajmal and Kurnianggoro, Laksono and Jo, Kang-Hyun",
              title = "Estimation Of Collision Risk For Improving Driver's Safety",
              booktitle = "IECON",
              year = "2016"
          }
          
      Bibtex
      @Article{Enzweiler_2008_PAMI,
          author = "Enzweiler, Markus and Gavrila, Dariu M",
          title = "Monocular Pedestrian Detection: Survey And Experiments",
          journal = "PAMI",
          volume = "31",
          number = "12",
          pages = "2179--2195",
          year = "2008"
      }
      

2007

↑ top
    UCY link paper
    • Summary: A dataset of 4 surveillance videos capturing 900+ pedestrian trajectories in outdoor environments
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, trajectory, gaze
    • Task: Surveillance
      Used in papers
        Fang et al., "TPNet: Trajectory Proposal Network for Motion Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Fang_2020_CVPR,
              author = "Fang, Liangji and Jiang, Qinhong and Shi, Jianping and Zhou, Bolei",
              title = "TPNet: Trajectory Proposal Network for Motion Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Hu et al., "Collaborative Motion Prediction via Neural Motion Message Passing", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Hu_2020_CVPR,
              author = "Hu, Yue and Chen, Siheng and Zhang, Ya and Gu, Xiao",
              title = "Collaborative Motion Prediction via Neural Motion Message Passing",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Sun et al., "Recursive Social Behavior Graph for Trajectory Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sun_2020_CVPR,
              author = "Sun, Jianhua and Jiang, Qinhong and Lu, Cewu",
              title = "Recursive Social Behavior Graph for Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Sun et al., "Reciprocal Learning Networks for Human Trajectory Prediction", CVPR, 2020. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sun_2020_CVPR_2,
              author = "Sun, Hao and Zhao, Zhiqun and He, Zhihai",
              title = "Reciprocal Learning Networks for Human Trajectory Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Haddad et al., "Self-Growing Spatial Graph Networks for Pedestrian Trajectory Prediction", WACV, 2020. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Haddad_2020_WACV,
              author = "Haddad, Sirin and Lam, Siew-Kei",
              title = "Self-Growing Spatial Graph Networks for Pedestrian Trajectory Prediction",
              booktitle = "WACV",
              year = "2020"
          }
          
        Li, "Which Way Are You Going? Imitative Decision Learning For Path Forecasting In Dynamic Scenes", CVPR, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2019_CVPR,
              author = "Li, Yuke",
              title = "Which Way Are You Going? Imitative Decision Learning For Path Forecasting In Dynamic Scenes",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Liang et al., "Peeking Into The Future: Predicting Future Person Activities And Locations In Videos", CVPR, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Liang_2019_CVPR,
              author = "Liang, Junwei and Jiang, Lu and Niebles, Juan Carlos and Hauptmann, Alexander G. and Fei-Fei, Li",
              title = "Peeking Into The Future: Predicting Future Person Activities And Locations In Videos",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Sadeghian et al., "Sophie: An Attentive Gan For Predicting Paths Compliant To Social And Physical Constraints", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Sadeghian_2019_CVPR,
              author = "Sadeghian, Amir and Kosaraju, Vineet and Sadeghian, Ali and Hirose, Noriaki and Rezatofighi, Hamid and Savarese, Silvio",
              title = "Sophie: An Attentive Gan For Predicting Paths Compliant To Social And Physical Constraints",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Zhang et al., "Sr-Lstm: State Refinement For Lstm Towards Pedestrian Trajectory Prediction", CVPR, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhang_2019_CVPR,
              author = "Zhang, Pu and Ouyang, Wanli and Zhang, Pengfei and Xue, Jianru and Zheng, Nanning",
              title = "Sr-Lstm: State Refinement For Lstm Towards Pedestrian Trajectory Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Zhao et al., "Multi-Agent Tensor Fusion For Contextual Trajectory Prediction", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zhao_2019_CVPR,
              author = "Zhao, Tianyang and Xu, Yifei and Monfort, Mathew and Choi, Wongun and Baker, Chris and Zhao, Yibiao and Wang, Yizhou and Wu, Ying Nian",
              title = "Multi-Agent Tensor Fusion For Contextual Trajectory Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Choi et al., "Looking To Relations For Future Trajectory Forecast", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Choi_2019_ICCV,
              author = "Choi, Chiho and Dariush, Behzad",
              title = "Looking To Relations For Future Trajectory Forecast",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Huang et al., "Stgat: Modeling Spatial-Temporal Interactions For Human Trajectory Prediction", ICCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Huang_2019_ICCV,
              author = "Huang, Yingfan and Bi, Huikun and Li, Zhaoxin and Mao, Tianlu and Wang, Zhaoqi",
              title = "Stgat: Modeling Spatial-Temporal Interactions For Human Trajectory Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Thiede et al., "Analyzing The Variety Loss In The Context Of Probabilistic Trajectory Prediction", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Thiede_2019_ICCV,
              author = "Thiede, Luca Anthony and Brahma, Pratik Prabhanjan",
              title = "Analyzing The Variety Loss In The Context Of Probabilistic Trajectory Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Kosaraju et al., "Social-Bigat: Multimodal Trajectory Forecasting Using Bicycle-Gan And Graph Attention Networks", NeurIPS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Kosaraju_2019_NeurIPS,
              author = "Kosaraju, Vineet and Sadeghian, Amir and Mart{\'\i}n-Mart{\'\i}n, Roberto and Reid, Ian and Rezatofighi, Hamid and Savarese, Silvio",
              title = "Social-Bigat: Multimodal Trajectory Forecasting Using Bicycle-Gan And Graph Attention Networks",
              booktitle = "NeurIPS",
              year = "2019"
          }
          
        Anderson et al., "Stochastic Sampling Simulation For Pedestrian Trajectory Prediction", IROS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Anderson_2019_IROS,
              author = "Anderson, Cyrus and Du, Xiaoxiao and Vasudevan, Ram and Johnson-Roberson, Matthew",
              booktitle = "IROS",
              title = "Stochastic Sampling Simulation For Pedestrian Trajectory Prediction",
              year = "2019"
          }
          
        Li et al., "Conditional Generative Neural System For Probabilistic Trajectory Prediction", IROS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2019_IROS,
              author = "Li, Jiachen and Ma, Hengbo and Tomizuka, Masayoshi",
              booktitle = "IROS",
              title = "Conditional Generative Neural System For Probabilistic Trajectory Prediction",
              year = "2019"
          }
          
        Zhu et al., "Starnet: Pedestrian Trajectory Prediction Using Deep Neural Network In Star Topology", IROS, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zhu_2019_IROS,
              author = "Zhu, Yanliang and Qian, Deheng and Ren, Dongchun and Xia, Huaxia",
              booktitle = "IROS",
              title = "Starnet: Pedestrian Trajectory Prediction Using Deep Neural Network In Star Topology",
              year = "2019"
          }
          
        Xue et al., "Location-Velocity Attention For Pedestrian Trajectory Prediction", WACV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xue_2019_WACV,
              author = "Xue, H. and Huynh, D. and Reynolds, M.",
              booktitle = "WACV",
              title = "Location-Velocity Attention For Pedestrian Trajectory Prediction",
              year = "2019"
          }
          
        Gupta et al., "Social Gan: Socially Acceptable Trajectories With Generative Adversarial Networks", CVPR, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Gupta_2018_CVPR,
              author = "Gupta, Agrim and Johnson, Justin and Fei-Fei, Li and Savarese, Silvio and Alahi, Alexandre",
              title = "Social Gan: Socially Acceptable Trajectories With Generative Adversarial Networks",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Hasan et al., "Mx-Lstm: Mixing Tracklets And Vislets To Jointly Forecast Trajectories And Head Poses", CVPR, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Hasan_2018_CVPR,
              author = "Hasan, Irtiza and Setti, Francesco and Tsesmelis, Theodore and Del Bue, Alessio and Galasso, Fabio and Cristani, Marco",
              title = "Mx-Lstm: Mixing Tracklets And Vislets To Jointly Forecast Trajectories And Head Poses",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Xu et al., "Encoding Crowd Interaction With Deep Neural Network For Pedestrian Trajectory Prediction", CVPR, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Xu_2018_CVPR_encoding,
              author = "Xu, Yanyu and Piao, Zhixin and Gao, Shenghua",
              title = "Encoding Crowd Interaction With Deep Neural Network For Pedestrian Trajectory Prediction",
              booktitle = "CVPR",
              year = "2018"
          }
          
        Fernando et al., "Gd-Gan: Generative Adversarial Networks For Trajectory Prediction And Group Detection In Crowds", ACCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Fernando_2018_ACCV,
              author = "Fernando, Tharindu and Denman, Simon and Sridharan, Sridha and Fookes, Clinton",
              editor = "Jawahar, C. V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Gd-Gan: Generative Adversarial Networks For Trajectory Prediction And Group Detection In Crowds",
              booktitle = "ACCV",
              year = "2019"
          }
          
        Vemula et al., "Social Attention: Modeling Attention In Human Crowds", ICRA, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Vemula_2018_ICRA,
              author = "Vemula, Anirudh and Muelling, Katharina and Oh, Jean",
              title = "Social Attention: Modeling Attention In Human Crowds",
              booktitle = "ICRA",
              year = "2018"
          }
          
        Hasan et al., ""Seeing Is Believing": Pedestrian Trajectory Forecasting Using Visual Frustum Of Attention", WACV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Hasan_2018_WACV,
              author = "Hasan, I. and Setti, F. and Tsesmelis, T. and Del Bue, A. and Cristani, M. and Galasso, F.",
              booktitle = "WACV",
              title = "``Seeing Is Believing'': Pedestrian Trajectory Forecasting Using Visual Frustum Of Attention",
              year = "2018"
          }
          
        Xue et al., "Ss-Lstm: A Hierarchical Lstm Model For Pedestrian Trajectory Prediction", WACV, 2018. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Xue_2018_WACV,
              author = "Xue, H. and Huynh, D. Q. and Reynolds, M.",
              booktitle = "WACV",
              title = "Ss-Lstm: A Hierarchical Lstm Model For Pedestrian Trajectory Prediction",
              year = "2018"
          }
          
        Bartoli et al., "Context-Aware Trajectory Prediction", ICPR, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Bartoli_2018_ICPR,
              author = "Bartoli, F. and Lisanti, G. and Ballan, L. and Del Bimbo, A.",
              booktitle = "ICPR",
              title = "Context-Aware Trajectory Prediction",
              year = "2018"
          }
          
        Ma et al., "Forecasting Interactive Dynamics Of Pedestrians With Fictitious Play", CVPR, 2017. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ma_2017_CVPR,
              author = "Ma, Wei-Chiu and Huang, De-An and Lee, Namhoon and Kitani, Kris M.",
              title = "Forecasting Interactive Dynamics Of Pedestrians With Fictitious Play",
              booktitle = "CVPR",
              year = "2017"
          }
          
        Alahi et al., "Social Lstm: Human Trajectory Prediction In Crowded Spaces", CVPR, 2016. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Alahi_2016_CVPR,
              author = "Alahi, Alexandre and Goel, Kratarth and Ramanathan, Vignesh and Robicquet, Alexandre and Fei-Fei, Li and Savarese, Silvio",
              title = "Social Lstm: Human Trajectory Prediction In Crowded Spaces",
              booktitle = "CVPR",
              year = "2016"
          }
          
        Ballan et al., "Knowledge Transfer For Scene-Specific Motion Prediction", ECCV, 2016. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ballan_2016_ECCV,
              author = "Ballan, Lamberto and Castaldo, Francesco and Alahi, Alexandre and Palmieri, Francesco and Savarese, Silvio",
              editor = "Leibe, Bastian and Matas, Jiri and Sebe, Nicu and Welling, Max",
              title = "Knowledge Transfer For Scene-Specific Motion Prediction",
              booktitle = "ECCV",
              year = "2016"
          }
          
      Bibtex
      @Article{Lerner_2007_CGF,
          author = "Lerner, Alon and Chrysanthou, Yiorgos and Lischinski, Dani",
          title = "Crowds By Example",
          journal = "Computer graphics forum",
          volume = "26",
          number = "3",
          pages = "655--664",
          year = "2007"
      }
      
    Next Generation Simulation (NGSIM) link
    • Summary: A dataset of vehicle trajectories containing 10K+ frames of recording
    • Applications: Action prediction, Trajectory prediction
    • Data type and annotations: Map, trajectory
    • Task: Driving
      Used in papers
        Ding et al., "Predicting Vehicle Behaviors Over An Extended Horizon Using Behavior Interaction Network", ICRA, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Ding_2019_ICRA,
              author = "Ding, W. and Chen, J. and Shen, S.",
              booktitle = "ICRA",
              title = "Predicting Vehicle Behaviors Over An Extended Horizon Using Behavior Interaction Network",
              year = "2019"
          }
          
        Scheel et al., "Attention-Based Lane Change Prediction", ICRA, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Scheel_2019_ICRA,
              author = "Scheel, O. and Nagaraja, N. S. and Schwarz, L. and Navab, N. and Tombari, F.",
              booktitle = "ICRA",
              title = "Attention-Based Lane Change Prediction",
              year = "2019"
          }
          
        Scheel et al., "Situation Assessment For Planning Lane Changes: Combining Recurrent Models And Prediction", ICRA, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Scheel_2018_ICRA,
              author = "Scheel, O. and Schwarz, L. and Navab, N. and Tombari, F.",
              booktitle = "ICRA",
              title = "Situation Assessment For Planning Lane Changes: Combining Recurrent Models And Prediction",
              year = "2018"
          }
          
        Chandra et al., "Traphic: Trajectory Prediction In Dense And Heterogeneous Traffic Using Weighted Interactions", CVPR, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Chandra_2019_CVPR,
              author = "Chandra, Rohan and Bhattacharya, Uttaran and Bera, Aniket and Manocha, Dinesh",
              title = "Traphic: Trajectory Prediction In Dense And Heterogeneous Traffic Using Weighted Interactions",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Zhao et al., "Multi-Agent Tensor Fusion For Contextual Trajectory Prediction", CVPR, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Zhao_2019_CVPR,
              author = "Zhao, Tianyang and Xu, Yifei and Monfort, Mathew and Choi, Wongun and Baker, Chris and Zhao, Yibiao and Wang, Yizhou and Wu, Ying Nian",
              title = "Multi-Agent Tensor Fusion For Contextual Trajectory Prediction",
              booktitle = "CVPR",
              year = "2019"
          }
          
        Bi et al., "Joint Prediction For Kinematic Trajectories In Vehicle-Pedestrian-Mixed Scenes", ICCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Bi_2019_ICCV,
              author = "Bi, Huikun and Fang, Zhong and Mao, Tianlu and Wang, Zhaoqi and Deng, Zhigang",
              title = "Joint Prediction For Kinematic Trajectories In Vehicle-Pedestrian-Mixed Scenes",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Thiede et al., "Analyzing The Variety Loss In The Context Of Probabilistic Trajectory Prediction", ICCV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Thiede_2019_ICCV,
              author = "Thiede, Luca Anthony and Brahma, Pratik Prabhanjan",
              title = "Analyzing The Variety Loss In The Context Of Probabilistic Trajectory Prediction",
              booktitle = "ICCV",
              year = "2019"
          }
          
        Li et al., "Interaction-Aware Multi-Agent Tracking And Probabilistic Behavior Prediction Via Adversarial Learning", ICRA, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2019_ICRA,
              author = "Li, J. and Ma, H. and Tomizuka, M.",
              booktitle = "ICRA",
              title = "Interaction-Aware Multi-Agent Tracking And Probabilistic Behavior Prediction Via Adversarial Learning",
              year = "2019"
          }
          
        Tang et al., "Adaptive Probabilistic Vehicle Trajectory Prediction Through Physically Feasible Bayesian Recurrent Neural Network", ICRA, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Tang_2019_ICRA,
              author = "Tang, C. and Chen, J. and Tomizuka, M.",
              booktitle = "ICRA",
              title = "Adaptive Probabilistic Vehicle Trajectory Prediction Through Physically Feasible Bayesian Recurrent Neural Network",
              year = "2019"
          }
          
        Cho et al., "Deep Predictive Autonomous Driving Using Multi-Agent Joint Trajectory Prediction And Traffic Rules", IROS, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Cho_2019_IROS,
              author = "Cho, Kyunghoon and Ha, Timothy and Lee, Gunmin and Oh, Songhwai",
              booktitle = "IROS",
              title = "Deep Predictive Autonomous Driving Using Multi-Agent Joint Trajectory Prediction And Traffic Rules",
              year = "2019"
          }
          
      Bibtex
      @Misc{NGSIM_2007,
          author = "{U.S. Department of Transportation}",
          title = "Next Generation Simulation (NGSIM)",
          howpublished = "Online",
          accessed = "2019-11-29",
          year = "2007"
      }
      
    Lankershim Boulevard link
    • Summary: A dataset of vehicle trajectories containing 30 minutes of data recorded at Lankershim Boulevard
    • Applications: Trajectory prediction
    • Data type and annotations: RGB, trajectory
    • Task: Driving
      Used in papers
        Zhi et al., "Kernel Trajectory Maps For Multi-Modal Probabilistic Motion Prediction", CoRL, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Zhi_2019_CORL,
              author = "Zhi, Weiming and Ott, Lionel and Ramos, Fabio",
              title = "Kernel Trajectory Maps For Multi-Modal Probabilistic Motion Prediction",
              booktitle = "CoRL",
              year = "2019"
          }
          
      Bibtex
      @Misc{US_2007_Lankershim,
          author = "{U.S. Department of Transportation}",
          title = "Lankershim Boulevard Dataset",
          url = "https://www.fhwa.dot.gov/publications/research/operations/07029/index.cfm",
          year = "2007"
      }
      
    ETH Pedestrian link paper
    • Summary: A dataset of pedestrians recorded using a mobile platform, with 5K+ frames spanning over 6 minutes
    • Applications: Action prediction
    • Data type and annotations: RGB, bounding box, Tracking ID
    • Task: Driving
      Used in papers
        Hariyono et al., "Estimation Of Collision Risk For Improving Driver's Safety", IECON, 2016. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Hariyono_2016_IES,
              author = "Hariyono, Joko and Shahbaz, Ajmal and Kurnianggoro, Laksono and Jo, Kang-Hyun",
              title = "Estimation Of Collision Risk For Improving Driver's Safety",
              booktitle = "IECON",
              year = "2016"
          }
          
      Bibtex
      @InProceedings{Ess_2007_ICCV,
          author = "Ess, Andreas and Leibe, Bastian and Van Gool, Luc",
          title = "Depth And Appearance For Mobile Scene Analysis",
          booktitle = "ICCV",
          year = "2007"
      }
      
    AMOS link paper
    • Summary: A dataset of 17M+ images captured every half hour during a period of 6 months from 538 outdoor webcams across the US
    • Applications: Other prediction
    • Data type and annotations: RGB, time, camera coordinate
    • Task: Weather
      Used in papers
        Chu et al., "Visual Weather Temperature Prediction", WACV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Chu_2018_WACV,
              author = "Chu, W. and Ho, K. and Borji, A.",
              booktitle = "WACV",
              title = "Visual Weather Temperature Prediction",
              year = "2018"
          }
          
      Bibtex
      @InProceedings{Jacobs_2007_CVPR,
          author = "Jacobs, Nathan and Roman, Nathaniel and Pless, Robert",
          title = "Consistent Temporal Variations In Many Outdoor Scenes",
          booktitle = "CVPR",
          year = "2007"
      }
      

2006

↑ top
    Tucson, Arizona link paper
    • Summary: A dataset of wide-angle images of the sky with the corresponding temperature, recorded for 7 months at a rate of 10 frames per minute, for a total of approx. 1M images
    • Applications: Other prediction
    • Data type and annotations: RGB
    • Task: Weather
      Used in papers
        Siddiqui et al., "A Deep Learning Approach To Solar-Irradiance Forecasting In Sky-Videos", WACV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Siddiqui_2019_WACV,
              author = "Siddiqui, T. A. and Bharadwaj, S. and Kalyanaraman, S.",
              booktitle = "WACV",
              title = "A Deep Learning Approach To Solar-Irradiance Forecasting In Sky-Videos",
              year = "2019"
          }
          
      Bibtex
      @Article{Pickering_2006,
          author = "Pickering, TE",
          title = "The MMT All-Sky Camera",
          journal = "Ground-based and Airborne Telescopes",
          volume = "6267",
          pages = "62671A",
          year = "2006"
      }
      

2004

↑ top
    KTH link paper
    • Summary: A dataset of 6 basic actions, e.g. walking, jogging, running, recorded from 25 subjects at 25fps for a total of 2391 sequences
    • Applications: Video prediction
    • Data type and annotations: Grayscale, activity label
    • Task: Activity
      Used in papers
        Jin et al., "Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction", CVPR, 2020. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Jin_2020_CVPR,
              author = "Jin, Beibei and Hu, Yu and Tang, Qiankun and Niu, Jingyu and Shi, Zhiping and Han, Yinhe and Li, Xiaowei",
              title = "Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Wang et al., "Probabilistic Video Prediction From Noisy Data With a Posterior Confidence", CVPR, 2020. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2020_CVPR,
              author = "Wang, Yunbo and Wu, Jiajun and Long, Mingsheng and Tenenbaum, Joshua B.",
              title = "Probabilistic Video Prediction From Noisy Data With a Posterior Confidence",
              booktitle = "CVPR",
              year = "2020"
          }
          
        Lee et al., "Mutual Suppression Network For Video Prediction Using Disentangled Features", BMVC, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Lee_2019_BMVC,
              author = "Lee, Jungbeom and Lee, Jangho and Lee, Sungmin and Yoon, Sungroh",
              title = "Mutual Suppression Network For Video Prediction Using Disentangled Features",
              year = "2019",
              booktitle = "BMVC"
          }
          
        Wang et al., "Order Matters: Shuffling Sequence Generation For Video Prediction", BMVC, 2019. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2019_BMVC,
              author = "Wang, Junyan and Hu, Bingzhang and Long, Yang and Guan, Yu",
              title = "Order Matters: Shuffling Sequence Generation For Video Prediction",
              year = "2019",
              booktitle = "BMVC"
          }
          
        Li et al., "Flow-Grounded Spatial-Temporal Video Prediction From Still Images", ECCV, 2018. paper arxiv code
          Datasets Metrics
          Bibtex
          @InProceedings{Li_2018_ECCV,
              author = "Li, Yijun and Fang, Chen and Yang, Jimei and Wang, Zhaowen and Lu, Xin and Yang, Ming-Hsuan",
              title = "Flow-Grounded Spatial-Temporal Video Prediction From Still Images",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Oliu et al., "Folded Recurrent Neural Networks For Future Video Prediction", ECCV, 2018. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Oliu_2018_ECCV,
              author = "Oliu, Marc and Selva, Javier and Escalera, Sergio",
              title = "Folded Recurrent Neural Networks For Future Video Prediction",
              booktitle = "ECCV",
              year = "2018"
          }
          
        Bhattacharjee et al., "Predicting Video Frames Using Feature Based Locally Guided Objectives", ACCV, 2019. paper
          Datasets Metrics
          Bibtex
          @InProceedings{Bhattacharjee_2018_ACCV,
              author = "Bhattacharjee, Prateep and Das, Sukhendu",
              editor = "Jawahar, C.V. and Li, Hongdong and Mori, Greg and Schindler, Konrad",
              title = "Predicting Video Frames Using Feature Based Locally Guided Objectives",
              booktitle = "ACCV",
              year = "2019"
          }
          
        Jin et al., "VarNet: Exploring Variations For Unsupervised Video Prediction", IROS, 2018. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Jin_2018_IROS,
              author = "Jin, B. and Hu, Y. and Zeng, Y. and Tang, Q. and Liu, S. and Ye, J.",
              booktitle = "IROS",
              title = "VarNet: Exploring Variations For Unsupervised Video Prediction",
              year = "2018"
          }
          
        Wang et al., "PredRNN: Recurrent Neural Networks For Predictive Learning Using Spatiotemporal LSTMs", NeurIPS, 2017. paper code
          Datasets Metrics
          Bibtex
          @InProceedings{Wang_2017_NeurIPS,
              author = "Wang, Yunbo and Long, Mingsheng and Wang, Jianmin and Gao, Zhifeng and Yu, Philip S",
              title = "PredRNN: Recurrent Neural Networks For Predictive Learning Using Spatiotemporal LSTMs",
              booktitle = "NeurIPS",
              year = "2017"
          }
          
      Bibtex
      @InProceedings{Schuldt_2004_ICPR,
          author = "Schuldt, Christian and Laptev, Ivan and Caputo, Barbara",
          title = "Recognizing Human Actions: A Local SVM Approach",
          booktitle = "ICPR",
          volume = "3",
          year = "2004"
      }
      

1981

↑ top
    Golden, Colorado link
    • Summary: A dataset of wide-angle images of the sky with the corresponding temperature, recorded for 12 years at 1 frame every 10 minutes, for a total of 300K+ images
    • Applications: Other prediction
    • Data type and annotations: RGB
    • Task: Weather
      Used in papers
        Siddiqui et al., "A Deep Learning Approach To Solar-Irradiance Forecasting In Sky-Videos", WACV, 2019. paper arxiv
          Datasets Metrics
          Bibtex
          @InProceedings{Siddiqui_2019_WACV,
              author = "Siddiqui, T. A. and Bharadwaj, S. and Kalyanaraman, S.",
              booktitle = "WACV",
              title = "A Deep Learning Approach To Solar-Irradiance Forecasting In Sky-Videos",
              year = "2019"
          }
          
      Bibtex
      @techreport{Stoffel_1981,
          author = "Stoffel, T and Andreas, A",
          title = "NREL Solar Radiation Research Laboratory (SRRL): Baseline Measurement System (BMS); Golden, Colorado (Data)",
          year = "1981",
          institution = "National Renewable Energy Lab. (NREL)"
      }