The ASF is made up of nearly 150 top level projects which cover a wide range of technologies. Chances are if you are looking for a rewarding experience in Open Source, you are going to find it here.
Apache SpamAssassin is an email filter used to identify spam
Apache SpamAssassin is an extensible email filter which is used to identify spam. Using its rule base, it uses a wide range of advanced heuristic and statistical analysis tests on mail headers and body text to identify "spam", also known as unsolicited bulk email. Once identified, the mail can then be optionally tagged as spam for later filtering. It provides a command line tool to perform filtering, a client-server system to filter large volumes of mail, and Mail::SpamAssassin, a set of Perl modules.
Apache XML Commons External provides an Apache-hosted set of SAX, DOM and JAXP interfaces for use in other xml-based projects.
The External components portion of Apache XML Commons contains interfaces that are defined by external standards organizations. For DOM, that's the W3C; for SAX it's David Megginson (http://www.saxproject.org); for JAXP it's Sun. While we could send users to each of the primary sources for these deliverables, keeping our own versions of these in the XML Commons repository gives us a number of advantages: 1) Simplicity of downloads; users get the whole product from one place, 2) Better version control; we can only take fixes we want and add Apache-specific changes, 3) Better overview documentation of how these interfaces fit into...
Apache Giraph is an iterative graph processing system built for high scalability.
Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.