Pattern recognition and machine learning are already used in many specialty applications (spam classification, handwriting recognition, OCR, ad placement, etc.), but their potential for improving the way we use and interact with computers is far greater. Many functions and behaviors are currently based on fixed, manually constructed rules that require programming skill, software engineering, and extensive testing to create and make reliable. Replacing this code with adaptive, trainable modules based on pattern recognition and machine learning holds the promise of reducing software development effort, making software more robust, and solving real-world problems better.
While software engineering has advanced greatly over the last two decades, pattern recognition and machine learning code is still largely developed on a case-by-case basis by specialists. Furthermore, the methods for developing, training, and testing pattern recognition and machine learning modules differ greatly from those used for other software systems: these methods are data-driven, and new training data often changes their behavior significantly for many input patterns. Normal software systems share neither property.
The goal of the PaREn project is to create the methods and tools that allow non-experts to use, train, test, and deploy pattern recognition and machine learning modules in real-world software systems. The expected benefits are a far wider usage of pattern recognition and machine learning methods, leading both to better decisions and behaviors in software systems and to lower development costs.
Perhaps the biggest obstacle to the adoption and integration of pattern recognition and machine learning methods into real-world software systems is the mathematical complexity and sophistication required to adapt them to particular problems. This is not primarily a software engineering issue; it is a fundamental problem with the methods themselves: they usually have many parameters, and their behavior is highly sensitive to how pattern recognition modules are interconnected. Many existing software components may also have many parameters that affect their behavior, but those parameters generally correspond to concepts that are meaningful to application developers and can be optimized via trial and error, whereas the parameters of pattern recognition methods typically lack such intuitive meaning. A major effort in the PaREn project will therefore be the development of new statistical methods and algorithms for reducing the number of parameters needed by pattern recognition and machine learning methods; for automating parameter optimization, model selection, and machine learning system construction; and for supporting rapid testing, validation, and on-line adaptivity. Finally, we will develop new collaborative environments and user interfaces for the development and training of pattern recognition systems, useful both for teaching pattern recognition and for supporting the development and adaptation of pattern recognition systems in real-world environments.
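To make the idea of automated parameter optimization concrete, the following is a minimal sketch and not a PaREn method or tool. It assumes a toy k-nearest-neighbor classifier, synthetic two-class data, and a small candidate list for the single parameter k; all function names (knn_predict, cross_val_accuracy, select_k, make_toy_data) are hypothetical and chosen only for illustration. The parameter is selected by cross-validated accuracy, so the user never has to tune it by hand.

```python
import random
from collections import Counter


def knn_predict(train, x, k):
    """Classify x by majority vote among its k nearest training points."""
    nearest = sorted(
        train,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(data, k, folds=5):
    """Pooled held-out accuracy of k-NN over interleaved folds."""
    correct = 0
    for f in range(folds):
        # Fold f holds every sample whose index is congruent to f mod folds.
        test = data[f::folds]
        train = [p for i, p in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)


def select_k(data, candidates=(1, 3, 5, 7, 9), folds=5):
    """Return the candidate k with the best cross-validated accuracy."""
    scores = {k: cross_val_accuracy(data, k, folds) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores


def make_toy_data(n=100, seed=0):
    """Synthetic data (an assumption): two Gaussian clusters in 2-D."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 3.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    return data


if __name__ == "__main__":
    best_k, scores = select_k(make_toy_data())
    print("selected k:", best_k)
    print("cross-validated accuracy per k:", scores)
```

The same pattern extends to several parameters or to a choice among different models by enlarging the candidate set, though exhaustive search quickly becomes expensive, which is one reason reducing the number of parameters matters.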