Counterinsurgency Training by ‘Virtual Human’
Using artificial intelligence and the graphics techniques behind “Avatar,” a USC institute creates “virtual humans” and interactive immersions that train American soldiers to win hearts and minds in Iraq and Afghanistan.
The path to Bill Swartout’s office hints that he’s involved in something very … high-concept. The ground-floor reception area, the elevator and the hallway leading to his office are decorated in a combination of brushed metal and sleek curves that seems to have both futuristic and retro influences. When I ask about the techno-deco look, Swartout mentions matter-of-factly that the interiors here were designed by Herman Zimmerman, a production designer who worked on Star Trek.
The office itself sits atop a six-story building in Marina del Rey and has a killer view over Los Angeles; on the day I visited, it was decked out with large, twin flat-panel computer screens, a laptop and a new iPad, among other tech toys. But the high-tech surroundings really aren’t much more than a hint of the fascinating, truly bleeding-edge reality of Swartout’s work. He and his boss, Randall Hill Jr., lead teams of computer scientists, graphics visionaries, artificial-intelligence wizards, social-science experts, digital game makers and Hollywood storytellers who are taking the notion of virtual reality to a new level of fidelity, creating immersive environments that, among other things, help America’s soldiers experience the culture of Iraq and Afghanistan before they go and treat them for post-traumatic stress when they return.
In the dry prose of the Web page bio, Hill is executive director and Swartout is director of technology for the University of Southern California’s Institute for Creative Technologies. There’s nothing dry about ICT’s work, though; the institute is at the forefront of creating “virtual humans” — that is, extremely lifelike animations that, through the near-magic of artificial-intelligence algorithms and a host of intertwined technologies, can respond in realistic ways to the speech and actions of actual human beings.
Most of the institute’s funding comes from the Army; in bureaucratese, it’s a university-affiliated research center, and it has an annual budget in the low tens of millions of dollars and a staff of more than 100. The bulk of its training simulations are aimed at the skills soldiers will need, from training through deployment, and the services they will require on return. But the institute’s research has also spawned applications that seem likely to have civilian appeal, from a “virtual patient” who helps doctors, psychological clinicians and social workers learn diagnostic skills to National Science Foundation-funded, twin-sister “virtual museum guides” who answer questions for visitors to a Boston science museum.
On the graphic-realism side of things, the institute is responsible for developing the Light Stage technology recently used to create animated characters with lifelike human facial expressions and movement patterns for the blockbuster 3-D film Avatar. (In fact, Paul Debevec, who leads the institute’s graphics laboratory, received a technical Academy Award for his work on the technology.) But the fascinating edge of ICT’s work doesn’t center on graphical realism alone. It includes the many technologies and artificial-intelligence strategies behind the virtual humans who inhabit these immersive simulations, and the structure of the simulations themselves — which, increasingly, can make humans believe they are free to respond as if they were in the real world, with an endless number of choices available to them.
Among other aims, the institute is now looking to make virtual humans cheaper and, as cost comes down, “a lot more ubiquitous,” Swartout notes. In time, they will most certainly be used in many training and educational roles. (I find it hard to imagine they won’t eventually find a place teaching at least portions of the basic classes that dominate the freshman year at most universities.) And in terms of entertainment, video gamers won’t just be blasting their way through the bad guys in relatively circumscribed first-person-shooter scenarios. In not too long, Swartout says, they’ll also be talking to their games and “interacting with people who’re talking back.”
The Institute for Creative Technologies was founded in 1999 with a grant from the Army; its brief was to bring the technology expertise of USC together with the storytelling and other creative abilities of Hollywood and the video game world to make training simulations for the military. “We came to the conclusion that they had a pretty good handle on training someone to shoot a gun,” Swartout says, so the institute focused on the human dimension of military action. Though conceived long before the terror attacks of Sept. 11, 2001, and subsequent U.S. military action in the Middle East, the focus now seems almost prescient, with ICT simulations training soldiers in any number of immersive, highly realistic ways to engage with and win over the hearts and minds of the people of Iraq and Afghanistan.
There was obviously a learning curve to traverse on the way to the current crop of virtual humans. While explaining ICT’s history, Swartout showed an early simulation — created in 1998, before the Institute for Creative Technologies was formed — in which a sailor named “Steve” jerkily offers training for using a compressor on a Navy ship; the experience is about as realistic and immersive as the 1982 movie Tron.
But since then, the multiple technologies that combine in a virtual-reality presentation have advanced rapidly. In 2007, ICT took a non-animated character the Army had been using on its Web recruiting platform, Sgt. Star, and added then-cutting-edge video graphics, making him into a virtual human whose square-shouldered form, significant ego and macho sense of humor are projected onto a translucent screen that gives him a rounded, holographic appearance. Sgt. Star’s responses to questions are driven by artificial-intelligence algorithms, so he can answer most reasonable questions potential recruits might ask, and he moves his arms and body as he speaks, as a human sergeant might (if he were a stiff or sore sergeant). The overall effect is engaging; for instance, when asked something outside his set of available responses, Sgt. Star might dismiss the question, saying it’s more suited to an Internet dating site, and order a potential recruit on to the next query.
Still, Sgt. Star is incompletely persuasive as a virtual human, and ICT is aiming well beyond him, engaging in a wide array of basic and applied research and prototype design and production to create a blizzard of training applications and research prototypes that are increasingly realistic, flexible and immersive.
Sometimes that immersive quality is more a matter of concept than life-size graphic wow-factor, as with “Urban Sim,” a PC-based game in which Army trainees try to manage relations with the various factions in a digital version of an Iraqi city during a counterinsurgency campaign. (The current “Urban Sim” is based on actual experiences from Tal Afar, Iraq, that were then fictionalized, so the scenario was typical of the geographic region, without being specific to one place. Swartout says “Urban Sim” is now being adapted to support scenarios representative of Afghanistan.) Modeled roughly on the game “Sim City,” “Urban Sim” has teams of trainees decide how best to pursue counterinsurgency as the town’s various groups react in ways that are based on what commanders have actually experienced in Iraq, as translated to a computerized “deep social simulation” that incorporates the ways in which cultural groups and leaders in the city interact. As in Iraq, it is easy for a trainee’s actions to produce unintended negative consequences, and in the end, trainees are judged not by purely military objectives but by the level of support citizens show for military and civilian leaders.
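The “deep social simulation” described above can be sketched in miniature: each faction holds a support level, a trainee’s action shifts some factions directly, and allied factions absorb a damped share of the effect — which is exactly where unintended consequences come from — while the final score measures citizen support rather than military objectives. The faction names, effect sizes, and alliance rule below are invented for illustration; the article does not disclose “Urban Sim”’s actual model.

```python
# Toy sketch of a faction-reaction model in the spirit of "Urban Sim."
# Faction names, deltas, and the alliance-spillover rule are assumptions.

ALLIES = {
    "merchants": ["tribal_elders"],
    "tribal_elders": ["merchants"],
    "insurgent_sympathizers": [],
}

def apply_action(support, direct_effects, spillover=0.5):
    """Update per-faction support (0..1) for one trainee action.
    Each affected faction's allies receive a damped share of the
    effect, modeling second-order (unintended) consequences."""
    new = dict(support)
    for faction, delta in direct_effects.items():
        new[faction] = min(1.0, max(0.0, new[faction] + delta))
        for ally in ALLIES[faction]:
            new[ally] = min(1.0, max(0.0, new[ally] + spillover * delta))
    return new

def popular_support(support):
    """Trainees are judged on overall citizen support, not body counts."""
    return sum(support.values()) / len(support)
```

A raid that alienates the merchants (a direct hit of −0.2) would also, through the alliance rule, cost half that amount of support among the tribal elders — the kind of ripple effect commanders reported from Tal Afar.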
Another insurgent-related offering is a mobile facility for training soldiers about improvised explosive devices; the final segment of the Mobile Counter-IED Interactive Trainer lets participants play the roles of both the soldier on the lookout for roadside explosives and the insurgent who plants them, in what Kim LeMasters, creative director for the institute, calls “a deeply immersive video game.” LeMasters says MCIT has only been in use since last summer, so there hasn’t been time to gather authoritative statistics on its effectiveness. But, he says, soldiers back from Iraq say, “I wish we’d had it before we went.”
The institute has also developed a treatment regime for post-traumatic stress disorder that uses virtual reality simulation to help returning soldiers re-experience some elements of their time in theater and, with the help of a therapist, process their emotions. Swartout says that in clinical trials, 75 percent of the patients who complete this set of simulations benefit (that is, are evaluated as “subclinical” in regard to PTSD symptoms afterward).
The ICT is working to make both humans and their environments look more lifelike; to involve more senses in the simulations so, as Swartout puts it, “it’s not just sight — it’s sound, it’s touch”; and to coordinate video simulations with actual objects, blurring the border in simulations between the virtual and real. The institute also wants to make virtual humans more widely usable; it even has a “toolkit” that provides researchers, free of charge, with the components and support needed to produce virtual humans. (Those who want to make virtual humans for a commercial project can, of course, license some of the ICT technology.)
But the institute is still mainly focused on efforts to train soldiers for the kinds of human interactions that require them not to use a gun but, as Swartout smilingly puts it, “to use their words.”
In a standard narrative — think of a novel or movie — a writer or director has control over the progress of the story. A reader or viewer can throw the book down or stalk out of the theater, but short of such dire action, he or she must follow the story line the writer lays out.
An interactive simulation populated by virtual humans, on the other hand, cannot come across as realistic if there is just one story line. In this type of simulation, virtual humans must be able to respond reasonably to a wide array of questions or actions the real humans throw at them. This need for a narrative that can “branch,” seemingly at any time, depending on how humans behave, is one of the leading challenges for the institute’s scientists and storytellers.
The challenge is illustrated in simple form by Sgt. Star, who, LeMasters notes, can serve up some 8,500 responses to questions asked by potential Army recruits. But, LeMasters says, when someone not familiar with the simulated sergeant is put in front of him, “within five questions that person has asked something not in the knowledge base.” For Sgt. Star and other relatively simple simulations, these off-subject inquiries are often handled with a humorous aside that doesn’t really answer the question but may (or may not) pull a human participant back into the simulation.
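The retrieval problem LeMasters describes — matching a free-form question against a fixed bank of canned responses, and deflecting with humor when nothing matches well — can be sketched with a simple word-overlap scorer. The sample response bank, the scoring function, and the threshold here are all illustrative assumptions; the article does not describe Sgt. Star’s actual matching algorithm.

```python
# Minimal sketch of canned-response retrieval with an off-topic fallback.
# The response bank, Jaccard scoring, and threshold are assumptions,
# not ICT's actual implementation.

RESPONSES = {
    "what jobs does the army offer": "The Army has more than 150 career fields.",
    "how long is basic training": "Basic Combat Training runs about ten weeks.",
    "can i go to college": "The Army can help pay for college.",
}

FALLBACK = ("That sounds like a question for an Internet dating site. "
            "Next question, recruit!")

def tokenize(text):
    # Crude tokenizer: lowercase, trim surrounding punctuation, split on spaces.
    return set(text.lower().strip("?!. ").split())

def answer(question, threshold=0.5):
    """Return the best canned response, or a humorous deflection when
    no bank entry overlaps enough with the question's words."""
    q = tokenize(question)
    best_score, best_reply = 0.0, FALLBACK
    for key, reply in RESPONSES.items():
        k = tokenize(key)
        score = len(q & k) / len(q | k)  # Jaccard similarity
        if score > best_score:
            best_score, best_reply = score, reply
    return best_reply if best_score >= threshold else FALLBACK
```

With only a few thousand entries in the bank, the fallback branch fires quickly for unfamiliar users — which is exactly the “within five questions” failure mode LeMasters describes.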
In other simulations involving virtual humans, LeMasters says, ICT researchers have tried to heighten drama to draw people into the interactive environment and keep them there. In one simulation, a game called “Gunslinger,” the human participant plays the role of a U.S. Ranger in a Wild West saloon. Virtual humans — including a bartender — are also in the bar, and they explain to the ranger that there’s an evil bandit in town who, they hope, the ranger will take care of. In early versions of the game, LeMasters says, most participants were Army personnel who would often take the revolver from their holster, point it at the virtual bartender’s head and order him to tell everything he knows about the town and the bandit, short-circuiting the story before it even started.
Having worked as a top entertainment executive for ABC (where he oversaw development of Happy Days, among other TV shows) and CBS (Lonesome Dove and Murphy Brown), LeMasters knows a bit about stories. Eventually, he says, the team involved with “Gunslinger” decided that to keep a human participant in the simulation, the virtual humans needed to raise the narrative tension in a way that limited options. So the virtual humans were programmed to display intense fear, immediately launching into an explanation of the bandit’s evil-doing in the town and creating a “pressurized situation” that would freeze human participants — at least for a time. Once a participant is “loaded” with this background information, LeMasters says, he is much more amenable to listening when the virtual humans ask whether he needs help dealing with the bandit. (If he accepts the help, LeMasters says, the ranger becomes dependent on his virtual companions and is very likely to follow their lead through the simulation; if he declines, the ranger winds up in a gun duel with the bandit that, the simulation ensures, the ranger loses, every time.)
Other simulations use more flexible approaches to keep their humans involved. In one, a research prototype known as Stability and Support Operations — Enhanced Negotiation, Army participants negotiate with two virtual humans, an Iraqi doctor and an elder, who are responsible for a clinic located in a city market in Iraq. The trainee must convince them to move the clinic downtown, closer to an American base.
In this complex negotiating scenario, the virtual humans are life-size, and they respond to trainees (and one another) not just by talking, but with a full gamut of facial expressions and body movements, all of which flow from a decision-making process driven by the virtual humans’ internal “mental state,” which artificial-intelligence programming derives from many factors, including a calculation of emotion. So trainees must respond not just to the practical requirements of moving the clinic — it will need a source of electricity, for instance — but to the Iraqi virtual doctor’s fear that his clinic will be attacked if it is seen as too closely associated with the Americans.
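The emotion-driven decision process described above can be illustrated with a toy appraisal loop: the virtual doctor appraises each offer, updates his fear and trust, and lets the resulting state select his reply (and, in the full system, his expression and gestures). The attributes, weights, and thresholds here are invented; the article does not publish ICT’s actual emotional model.

```python
# Illustrative appraisal-style "mental state" for a virtual negotiator.
# Attributes, weights, and thresholds are assumptions for this sketch.

class VirtualDoctor:
    def __init__(self):
        self.fear = 0.4   # worry the clinic becomes an insurgent target
        self.trust = 0.2  # rapport with the trainee

    def appraise(self, offer):
        """Update emotion from how the trainee's offer is appraised."""
        if offer.get("provides_electricity"):
            self.trust += 0.2   # a practical need is addressed
        if offer.get("near_us_base"):
            self.fear += 0.3    # visible US association feels dangerous
        if offer.get("local_security_pledged"):
            self.fear -= 0.3    # local protection reduces perceived threat
        self.fear = min(1.0, max(0.0, self.fear))
        self.trust = min(1.0, max(0.0, self.trust))

    def respond(self):
        """Words (and, in the full system, expression) follow emotion."""
        if self.fear > 0.5:
            return "refuse"     # would be voiced with a fearful expression
        return "agree" if self.trust >= 0.4 else "hesitate"
```

An offer of electricity that also moves the clinic next to the base raises trust and fear at once, and the fearful refusal wins — the trainee must then address the fear (say, by pledging local security) before the practical concessions count for anything.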
To make this type of virtual scenario even more true to life and seemingly unpredictable, Swartout says, researchers intend to study ways for a human “director” (or, ultimately, an artificial-intelligence agent) to intervene when real humans start to behave in ways that stray beyond the scenario’s boundaries. In effect, this agent would shift the narrative, subtly redirecting trainees into a web of possible responses surrounding the general story line — without the trainees noticing that they have been influenced.
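One minimal way to picture such a director agent: measure how far the trainee’s recent utterances have drifted from the scenario’s topics, and when drift crosses a threshold, hand a redirecting line to one of the virtual humans — staying invisible otherwise. The topic set, drift measure, and intervention table below are invented for illustration; ICT’s planned approach is not specified in the article.

```python
# Hedged sketch of a "director" agent that nudges a drifting trainee
# back toward the scenario. Topics, thresholds, and lines are invented.

SCENARIO_TOPICS = {"clinic", "security", "electricity", "move", "doctor"}

INTERVENTIONS = [
    # (minimum drift level, line delivered by a virtual human)
    (0.75, "The doctor interrupts: 'Please - we must settle the clinic question.'"),
    (0.50, "The elder steers back: 'But what of the clinic's electricity?'"),
]

def drift(utterance_words):
    """Fraction of the trainee's recent words that are off-topic."""
    words = [w.lower() for w in utterance_words]
    off_topic = [w for w in words if w not in SCENARIO_TOPICS]
    return len(off_topic) / len(words)

def direct(utterance_words):
    """Return a redirecting line when drift is high, else None
    (the director stays invisible while the trainee is on track)."""
    d = drift(utterance_words)
    for threshold, line in INTERVENTIONS:
        if d >= threshold:
            return line
    return None
```

Because the nudge arrives as ordinary dialogue from a character already in the scene, the trainee experiences it as part of the story rather than as correction — the “without noticing” property Swartout describes.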
The progress being made in virtual humans and the simulations they inhabit seems likely to have a broad impact across society, and not necessarily in some distant future. In some cases, the impact could well be disruptive, in positive and negative senses of the word.
In the entertainment field, for example, a collaboration between the institute’s graphics lab and Image Metrics, a Santa Monica, Calif.-based firm that “provides superior facial animation solutions to the entertainment industry,” has produced Digital Emily, a computer-generated human so realistic, down to the level of skin pores, that it is all but impossible to distinguish from the real person on whom it is modeled, Emily O’Brien, an actress in The Young and the Restless soap opera.
This technological fidelity to life already has some Hollywood actors in a tizzy. As Jeff Bridges, winner of the best actor Oscar this year, complained to the Los Angeles Times about the kind of “performance-capture” technology used in Avatar: “Actors will kind of be a thing of the past. We’ll be turned into combinations. A director will be able to say, ‘I want 60 percent Clooney; give me 10 percent Bridges; and throw some Charles Bronson in there.’ They’ll come up with a new guy who will look like nobody who has ever lived, and that person or thing will be huge.”
Advocates of the technologies that create human-virtual actor combinations, of course, see a brighter future. “Not only should [actors] not be afraid of it, they should be excited about it,” Avatar director James Cameron told the Times. “There is a new set of possibilities, after a century of doing movie acting in the same way.”
The spread of interactive immersions also will raise philosophical questions. Are people really making choices of their own free will when they are being carefully enticed through a virtual reality presentation? As virtual human technology spreads, we won’t all wind up living The Matrix. But is there much doubt that the ad-makers of Madison Avenue will be interested in immersing American consumers in simulations that very engagingly persuade them to empty their wallets by making what they think are their own choices?
There are also, no doubt, those who would question use of the country’s most advanced creative technology for military benefit, but on that score, I have no real qualms. War should be vigorously debated beforehand and avoided when possible, but once the country’s leaders decide — rightly or wrongly — to send Americans into battle, I want them to have access to every possible technological advantage, and to be as well trained as they can possibly be in the use of all available weaponry, including that most effective of weapons, the word.