2019 I/ITSEC

Reinforcement Learning for Automated Textual Reasoning (Room 320GH)

04 Dec 19
10:30 AM - 11:00 AM


While many of the most popular machine learning algorithms (such as convolutional and recurrent neural networks) date back decades, their practical realizations have spawned entirely new capabilities for the training community. Beyond breakthrough classification results for video, audio, and imagery, near-expert capability to understand natural language has surged since the late-2018 open-sourcing of Google's BERT (Bidirectional Encoder Representations from Transformers). As its name implies, BERT represents sentences and tokens bidirectionally, so the context of each token derives from the words both to its left and to its right. Its empirical results on 11 standard language tasks (such as question answering and natural language inference) set new state-of-the-art marks, in some cases approaching or exceeding human baselines.

We investigate BERT's ability to answer questions posed in everyday language, to model topics, and to paraphrase large training documents. We test whether the underlying language model offers something akin to a universal template: a common architecture that can pre-train on general domains and then quickly specialize to new, often obscure technical domains such as cybersecurity or non-English languages. This transfer-learning feature offers a rich toolkit for future training and testing even in the absence of labeled data, where supervised learning previously seemed impossible.

Finally, we apply the newly trained language model to the creation of scripted scenarios: rule-bending approaches that derive novel variants of a previously known rehearsal narrative. One concrete scripted example focuses on training military officers to negotiate successfully with non-combatants. We score this machine learning approach on its ability to generate entire negotiation and bargaining strategies conditioned on the current human terrain and the opponents' underlying motivations or interests. The results highlight three of the big concepts underlying deep learning: 1) transfer learning from fewer examples; 2) creative or adversarial generation of training data; and 3) reinforcement learning, or gaming with just rules and rewards in the complete absence of examples.
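
To make the bidirectional-context point concrete, here is a minimal sketch, assuming the open-source Hugging Face transformers library and the public bert-base-uncased checkpoint (neither is named in the abstract). It asks BERT's masked language model to fill in a token whose meaning is constrained by words on both sides:

```python
# Sketch: BERT predicts a masked token from context on BOTH sides.
# Assumes the Hugging Face `transformers` library and `bert-base-uncased`.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Words left AND right of [MASK] jointly constrain the prediction.
text = "The officer will [MASK] with the local leaders before the patrol."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and read off the top candidate tokens.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top_ids = logits[0, mask_pos].topk(5).indices[0]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))  # e.g. candidates like "meet"
```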
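The pre-train-then-specialize recipe the abstract describes can be sketched as standard fine-tuning. The example below is hypothetical throughout: the two-sentence "cybersecurity" corpus and its labels are invented for illustration, and a real specialization would use a substantive domain corpus.

```python
# Sketch: transfer learning by fine-tuning a general-domain BERT checkpoint
# on a tiny, HYPOTHETICAL specialized corpus (stand-in for cybersecurity text).
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Invented mini-corpus standing in for an obscure technical domain.
cyber_texts = ["port scan detected on external host", "routine system backup completed"]
cyber_labels = torch.tensor([1, 0])  # 1 = suspicious, 0 = benign

enc = tokenizer(cyber_texts, padding=True, truncation=True, return_tensors="pt")
loader = DataLoader(
    TensorDataset(enc["input_ids"], enc["attention_mask"], cyber_labels), batch_size=2
)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):  # a few epochs often suffice when transferring
    for input_ids, attention_mask, labels in loader:
        loss = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels).loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Because the encoder arrives pre-trained on general text, only the small classification head and a light pass over the domain data are needed, which is why the approach works with far fewer labeled examples.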
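The third concept, learning from rules and rewards in the complete absence of examples, can be illustrated with tabular Q-learning on a toy negotiation game. Every state, action, and reward below is hypothetical; the abstract does not specify the authors' actual formulation.

```python
# Sketch: reinforcement learning from rules and rewards alone, no labeled data.
# Toy two-state negotiation; all states, actions, and rewards are HYPOTHETICAL.
import random

states = ["hostile", "cooperative"]
actions = ["demand", "concede", "bargain"]
Q = {(s, a): 0.0 for s in states for a in actions}
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration

def step(state, action):
    """Rule-based environment: returns (next_state, reward)."""
    if state == "hostile":
        if action == "concede":
            return "cooperative", 1.0   # a concession defuses hostility
        return "hostile", -1.0          # demands or haggling entrench the opponent
    if action == "bargain":
        return "cooperative", 2.0       # bargaining pays off once rapport exists
    if action == "demand":
        return "hostile", -2.0          # demands destroy rapport
    return "cooperative", 0.0           # further concessions are neutral

for episode in range(500):
    state = "hostile"
    for _ in range(10):
        # Epsilon-greedy action selection.
        if random.random() < epsilon:
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: Q[(state, a)])
        next_state, reward = step(state, action)
        # Standard Q-learning update toward reward plus discounted best next value.
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# The learned policy should concede when hostile, then bargain once cooperative.
for s in states:
    print(s, "->", max(actions, key=lambda a: Q[(s, a)]))
```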