Course Description
In this course on robotic process automation (RPA), students from any industry will learn to automate some of the simple, repetitive software tasks that general office workers, managers, and information workers encounter. They will learn to automate transaction processing, data manipulation, communication between digital systems, and alerts whose responses require only limited cognitive effort.
Students will explore the capabilities of intelligent automation using AI agents and smart RPA, working with open-source datasets and benchmarks such as MiniWoB++, WebShop, WebArena, and Mind2Web, which provide web-browser-based navigation and interaction tasks for computer control. We'll cover:
- Simple button clicking
- Complex form-filling
- Dragging actions
- Booking systems
- Email app navigation
Throughout the course, we’ll explore the intricacies of various MiniWoB++, WebShop, WebArena, and Mind2Web tasks, observe how AI agents work on these tasks, and analyze their performance in detail. We’ll highlight tasks where the agents excel and tasks where humans still outperform them. As we investigate the challenges posed by specific tasks, such as simon-says and terminal, we’ll shed light on the factors behind the gaps between agent and human performance. The open-source intelligent agents we will study in the class include Pix2Act, MindAct, SeeAct, UFO, ProAgent, OpenAdapt, and CrewAI.
We’ll also survey research and advances toward human-level performance on smart and agentic RPA tasks; examine the strategies, techniques, and architectural choices that enable agents to achieve strong results; and identify the challenges and opportunities in the field of RPA.
