This page gathers some notes on ApacheCon Europe 2007, which took place in Amsterdam on 1-4 May 2007. All in all, this was a very interesting yet exhausting event, and the first such conference I have ever attended. My one regret is that the talks were mostly descriptive rather than giving an overall picture and trends. What I was looking for was some insight into strategic design choices: why such and such software was created, what tradeoffs were involved, why open source was a good fit for such a piece of software.
Anyway, the most interesting part of such events is the interaction with other people. While I am not a very sociable kind of guy, I managed to have interesting discussions and exchanges with several people:
- Brett Porter: on Maven and implementing extended reporting abilities
- Lars Trieloff: on http://www.mindquarry.com/ and Scala
- Aaron Farr: on Scala
- Emmanuel Lécharny
- Garrick McFarlane
- Claude Brisson: on Velosurf, literature and philosophy...
- Dennis Lundberg: on Maven
20070507: ApacheCon Europe Day 4
This is the last day of ApacheCon Europe in Amsterdam. I had to leave before the official close of the conference because of train schedules.
Web frameworks comparison (Matt Raible)
Matt Raible presented his own point of view on Web frameworks. The frameworks compared:
- Struts (1 and 2)
- JSF/MyFaces
- Spring
- Wicket
- Tapestry
Basic dichotomy of frameworks:
- component-based: the application is designed as an assembly of UI components tied to pages with UI templating
- action-based: the application is designed as a set of actions for request/response handling
Struts heavily dominates the scene due to its age. It will be around for a long time (unfortunately).
Developing with Roller and blogs (Dave Johnson)
Presentation of Roller, a professional blog web-application
ActiveMQ (Bruce Snyder)
Presentation of ActiveMQ, open-source Message Oriented Middleware.
20070503: ApacheCon Europe Day 3
Extending Ant : Steve Loughran, HP (??) Bristol
Presents several extension points for Ant beyond the standard task extension:
- scripts: inline/included script fragments that get interpreted by Ant
- resources: define new data types for stream I/O operations
Scripts
- Provide an easy way to extend/customize Ant behavior. Based on the Java 6 scripting framework (by default, uses JavaScript but can use any BSF-compatible scripting language).
- populate namespace with default values (self, attributes, project)
- testing through AntUnit: invokes tasks and provides assertXXX elements (can check against log output)
Tasks
Standard way: write a Java class extending the Task class. Methods of the form addXXX(T t) are called automatically by Ant with the objects it creates for nested elements.
Resources
Resources are what gets manipulated by Ant in the end: files, URLs, directories, streams, data...
They may be used as input/output streams of data.
They handle references in XML (ID/IDREF mechanism).
Dependency management with Ivy (Xavier Hanin)
Presentation of Ivy dependency management tool.
- handles Maven 2 repositories and uses POM information for transitive dependency resolution
- defines module declarations (~extensible Maven 2 classifiers) for customizing dependency resolution behavior depending on context. Repositories are handled through various protocols and layouts (the layout is customizable using pattern matching)
- dependencies may be published to repositories
- chains artifact resolution through various resolvers (composite)
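The chaining of resolvers mentioned in the last point can be illustrated with a small sketch. This is plain Python, not Ivy's actual API or configuration format: the resolver names, module coordinates and paths are all hypothetical, and the only idea shown is the composite pattern of asking each resolver in turn and keeping the first hit.

```python
from typing import Callable, Dict, Optional

# A resolver maps a module coordinate to an artifact location, or None if unknown.
Resolver = Callable[[str], Optional[str]]


def dict_resolver(name: str, index: Dict[str, str]) -> Resolver:
    """Build a toy resolver backed by an in-memory index (hypothetical repository)."""
    def resolve(module: str) -> Optional[str]:
        path = index.get(module)
        return "{}:{}".format(name, path) if path is not None else None
    return resolve


def chain(*resolvers: Resolver) -> Resolver:
    """Composite resolver: ask each resolver in order and return the first hit."""
    def resolve(module: str) -> Optional[str]:
        for resolver in resolvers:
            found = resolver(module)
            if found is not None:
                return found
        return None
    return resolve


# Hypothetical repositories: a local cache consulted before a remote repository.
local = dict_resolver("local", {"acme/core-1.0": "cache/acme/core-1.0.jar"})
remote = dict_resolver("remote", {
    "acme/core-1.0": "repo/acme/core-1.0.jar",
    "acme/util-2.1": "repo/acme/util-2.1.jar",
})
resolver = chain(local, remote)
```

Here resolver("acme/core-1.0") is answered by the local cache, while resolver("acme/util-2.1") falls through to the remote repository.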
Text search: Beyond Lucene with Solr (Bertrand Delacrétaz)
Presentation of work done for Télévision Suisse Romande on a VOD site with search and indexing. The application is based on Lucene and Solr:
- Lucene provides the basic text indexing and search algorithms
- Solr provides a front end and an "easy-to-use" web application framework
Hadoop
Open-source implementation of a MapReduce engine.
- runs on commodity hardware (cluster/blades)
- uses a distributed file system (~Google File System). The file system used is extensible.
- provides a task distribution/unification framework. The JobTracker schedules jobs as close as possible to their data
see MapReduce by Ralf Lämmel
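To make the map/reduce model concrete, here is a minimal single-process sketch of the classic word count. It does not use Hadoop's API and does nothing distributed: the function names are my own, and the "shuffle" step simply groups values by key in memory, which is the part a real engine spreads across machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word of one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["the quick fox", "the lazy dog", "the fox"])
```

Since each call to map_phase only sees one document and each call to reduce_phase only sees one key, both phases can in principle run in parallel on different nodes.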
Apache DS (Emmanuel Lécharny)
ADS: open-source LDAPv3-compliant server and system
Jackrabbit (Alex Popescu, InfoQ and Jukka Zitting)
Jackrabbit is an open-source content management system compliant with the JCR (Java Content Repository, JSR-170) specification. The talk offered two points of view.
Main features are:
- hierarchical namespace with properties attached to content
- session/workspace transactional system to isolate user operations from the real backend
- provides various communication points and APIs
- may be used embedded or as standalone server
20070503: ApacheCon Europe Day 2
Second Day in Amsterdam, with the real conference sessions starting.
Steven Pemberton (CWI / W3C)
Lively and high-level talk about abstraction levels in programming. He drew on his experience with ABC and Python to show how high-level concepts emerged in programming to harness more power from the machine. Growth in computing power has far outpaced growth in abstraction power (i.e. programming language abstraction level and ease of programming).
The main emphasis should be on usability of systems (vs. learnability, e.g. a violin is usable/fit for its usage, but definitely not learnable).
Distinction between sensitives (e.g. ordinary people) and intuitives (e.g. programmers): the former interact with the world through their senses, the latter interact with an internal model of the world. Most programmers are intuitive, most users are sensitive, hence the usability problem. What is usable from a programmer's point of view (e.g. a CLI) is not suitable for users.
Facts about SE :
- 90% of the total cost of software is debugging (in the broad sense)
- the number of bugs grows approximately as LOC^1.5, i.e. faster than linearly with code size
Hence reducing code size reduces the number of bugs: we need better abstractions to support the increase in power.
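A quick back-of-the-envelope check of what the LOC^1.5 figure implies, taking the exponent quoted in the talk at face value. The function names and the sample code sizes are mine, and the proportionality constant is unknown, so only ratios are meaningful.

```python
def relative_bugs(loc, exponent=1.5):
    """Bug count up to an unknown constant factor, assuming bugs ~ LOC^exponent."""
    return loc ** exponent


def bug_reduction_factor(old_loc, new_loc, exponent=1.5):
    """By how much the expected bug count shrinks when code goes from old_loc to new_loc."""
    return relative_bugs(old_loc, exponent) / relative_bugs(new_loc, exponent)


# Halving a hypothetical 10,000-line program divides the expected bug count
# by 2^1.5, which is about 2.83, rather than by 2.
factor = bug_reduction_factor(10_000, 5_000)
```

In other words, under this model a more expressive language that halves the code size pays off more than proportionally.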
Funnily enough, functional programming was never mentioned during the session.
Example: the clock as a view over the time model -> MVC pattern applied to abstraction.
ABC is an ancestor of Python.
Web Security Trends
Awareness about security is increasing, yet there are more and more security problems:
- number of attackers is also increasing
- developers are not proactive enough in taking security into account
- security loses the battle against feature development and delivery deadlines
XSS: Cross-site scripting
Basic principle:
- site provides a form where content can be input (eg. guest book)
- attacker provides dynamically interpreted content:
- CSS attacks (eg. display manipulation, content loading through background: src ...)
- phishing (replacing original sites content, eg. replacing divs)
- javascript attacks (most common, may be very subtle)
- upon request answer, browser interprets content and triggers the attack (cookies/session stealing, redirected login input, CCard capture...)
New threats without Javascript: java+javascript, Flex...
Response:
- validate every input character/string and escape the rest
- use whitelisting to filter: unescape only explicitly allowed content
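As an illustration of the "escape" half of this response, here is a minimal sketch using Python's standard html.escape. The render_comment function and the guest-book scenario are made up for the example; a real application would normally rely on its template engine's auto-escaping rather than formatting HTML by hand.

```python
import html


def render_comment(author, text):
    """Render a guest-book entry, escaping all user-supplied content."""
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(text))


# A typical script injection attempt submitted through the form.
malicious = '<script>alert("stolen " + document.cookie)</script>'
rendered = render_comment("guest", malicious)
```

After escaping, the browser displays the payload as text instead of interpreting it as a script element.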
CSRF: Cross-Site Request Forgeries
Principle:
- user visits a compromised/attacker-owned site
- reply contains JavaScript that
- ...triggers a request to a site where the user is currently logged in (e.g. make a bid on eBay, buy something, send some information...)
Difficult to defend against from client side:
- always log out explicitly
- do not trust unknown sites
From server side:
- require the user to log in again before important actions
- use number sequences to identify valid requests
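One common way to identify valid requests is an unpredictable per-session token that every form must send back. Reading the "number sequences" above as such a token is my assumption, not something stated in the talk notes. The Session class and function name below are hypothetical; only secrets.token_hex and hmac.compare_digest are real standard-library calls.

```python
import hmac
import secrets


class Session:
    """Hypothetical server-side session holding a random anti-forgery token."""

    def __init__(self):
        # 16 random bytes, hex-encoded; embedded in every form the site serves.
        self.csrf_token = secrets.token_hex(16)


def is_valid_request(session, submitted_token):
    """Accept a state-changing request only if it echoes the session's token."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(session.csrf_token, submitted_token)


session = Session()
legit = is_valid_request(session, session.csrf_token)  # form served by the real site
forged = is_valid_request(session, "0" * 32)           # attacker has to guess the token
```

The attacker's page can make the browser send the session cookie, but it cannot read the token embedded in the legitimate site's forms, so the forged request is rejected.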
SQL Injection
Principle:
- use the knowledge that some sent data is part of SQL query to retrieve sensitive information from the server (passwords, session ids, identifiers, ccards ...)
- used for DoS attacks
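The principle can be reproduced in a few lines with an in-memory SQLite database. The table, user names and payload are invented for the example, and the parameterized query shown as a fix is the standard defense rather than something taken from the talk notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "s3cret"), ("bob", "hunter2")],
)


def find_user_unsafe(name):
    """Vulnerable: user input is pasted directly into the SQL text."""
    query = "SELECT name, password FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name):
    """Safer: the driver passes the input as data, never as SQL."""
    return conn.execute(
        "SELECT name, password FROM users WHERE name = ?", (name,)
    ).fetchall()


# The attacker knows the input ends up inside a quoted string in a WHERE clause.
payload = "nobody' OR '1'='1"
leaked = find_user_unsafe(payload)  # the OR clause matches every row
safe = find_user_safe(payload)      # no user is literally named like the payload
```

With string concatenation the payload rewrites the WHERE clause and every row comes back; with a placeholder it is just an unusual user name that matches nothing.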
Ajax threats
- vulnerabilities in server-side APIs used to generate Javascript
Conclusion
- validate everything coming in
- escape everything going out
Strategic opportunities in OS - Rebecca Hansen, Sun
OSS is a structural change in the IT industry, as profound as the PC or Internet revolution.
3 forces driving the change:
- network: implies 0-cost distribution, distributed and open development
- community: traditional markets are transactional, OS market is relational: win-win relationship, interdependency, each actor is a member of community (vendor, user, developer, tester...)
- unbeatable value: lower cost for good enough software can give an edge against competition (no licence, simpler to use, large user base)
3 strategic opportunities:
- rebalance power
- share cost
- new source of wealth
Rebalance power
eg: IE vs. Firefox, Office vs. OpenOffice
OSS can rebalance power in mature, locked markets where TCO is high and the number of players is low. With enough momentum, OSS can become a challenger very quickly, e.g. in the database market (the strategy around Derby, MySQL, PostgreSQL). It only has to be good enough.
Howto:
- become OS
- befriend OS
- use OS
Share
Problem: OEM technology, costs a lot, tied to a vendor without control, not strategic
Solution: use OS.
If you got milk for free, do not buy the cow, feed it or share it.
New wealth source
Transactional market = lot of cost in marketing, two sources of revenue (selling licence, selling support)
Use community marketing instead of traditional marketing: peer-to-peer, relational means more trust, costs less money
Strategies:
- create revenue stream through support and service
- create value through differentiation (value-added product over OSS)
- complementary products
- reach more people
No-nonsense introduction to Semantic Web - Stefano Mazzocchi, MIT
Project Simile: various tools for aggregating data from sites using RDF.
RDF = provides metadata on the web for tools to process
RDF = solution to the interoperability problem at world-wide scale
- unit of information = statement (triples subject --predicate-> object)
- each part of a statement is globally unique (use URIs)
- cope with mistakes/inconsistent world
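The statement model is simple enough to sketch in a few lines. This is plain Python tuples, not a real RDF library; the example.org URIs, the vocabulary (knows, worksFor) and the match function are all made up for illustration.

```python
# Hypothetical namespace: every subject, predicate and object is a full URI.
EX = "http://example.org/"

triples = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", EX + "knows", EX + "carol"),
    (EX + "alice", EX + "worksFor", EX + "acme"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return the statements matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Everything stated about alice, whatever the predicate.
alice_facts = match(triples, subject=EX + "alice")
# Who knows carol?
knows_carol = match(triples, predicate=EX + "knows", obj=EX + "carol")
```

Because the parts are globally unique URIs, statements collected from different sites can be merged into one set without renaming anything, which is where the interoperability claim comes from.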
Problems:
- marketing! The specification is very complex and abstract
- the XML serialization complicates things: people got confused and rejected RDF
- technical: produces lots of disconnected graphs, no common vocabulary, may have inconsistencies
- querying is somewhat difficult and non-uniform
Follow-up: Birds of a Feather session, 20:00-21:00
A lot of people are interested in RDF, which seems to be maturing. There is a need for software and for traction from Apache.
Building RESTful services
- see Life beyond transactions (??)
Problem with POSTs:
- the same data may be submitted twice in a row with POST (not a problem with PUT or DELETE: a unique id is provided, so they are idempotent)
- scalability
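The difference between a repeated POST and a repeated PUT can be shown with a toy in-memory resource store. No HTTP is involved: the OrderStore class, its methods and the "order-42" id are hypothetical stand-ins for what a server would do when handling each verb.

```python
import itertools


class OrderStore:
    """Toy resource collection mimicking the semantics of POST, PUT and DELETE."""

    def __init__(self):
        self.orders = {}
        self._ids = itertools.count(1)

    def post(self, data):
        # POST: the server picks a fresh id, so every call creates a new resource.
        order_id = next(self._ids)
        self.orders[order_id] = data
        return order_id

    def put(self, order_id, data):
        # PUT: the client names the resource, so repeating the call changes nothing.
        self.orders[order_id] = data

    def delete(self, order_id):
        # DELETE: removing an already removed resource is a no-op.
        self.orders.pop(order_id, None)


posted = OrderStore()
posted.post({"item": "book"})
posted.post({"item": "book"})  # accidental resubmission creates a duplicate order

put_store = OrderStore()
put_store.put("order-42", {"item": "book"})
put_store.put("order-42", {"item": "book"})  # same final state as a single call
```

After the double submission, posted holds two orders while put_store still holds one, which is why retrying a PUT or DELETE is safe and retrying a POST is not.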
20070501: ApacheCon Europe Day 1
This is my first day at ApacheCon Europe 07, taking place in Amsterdam. This day was dedicated to the tutorials, of which I attended two:
- Maven + Archiva + Continuum tutorial, by Brett Porter. A nice presentation of the Maven ecosystem. I was particularly interested in Archiva, which acts both as a proxy and as a local repository manager. I had a hard time starting the whole thing up, maybe due to some problem with my Java configuration: the web application got stuck in an endless loop under JDK 1.5.0_06. It worked fine with the JDK bundled with Java EE 5, which is JDK 1.6. The talk was a bit too short, with the technical part taking most of the time at the expense of the software development process part. I would have liked to see more best practices, design choices and overall insight.
- Wicket tutorial, by Mattej Koop. Wicket is a modern web application framework that tries to stay away as much as possible from XML, heavy-handed configuration files and anything not Java. All components are designed in Java and tied to HTML template fragments, with a clear separation of logic and presentation. UI components draw data from Model objects that wrap around standard beans, allowing field validation code to be plugged in. Maybe the code for these could have been made a little more dynamic, so that type information could be used earlier in the validation process than it is now.