This page gathers some notes on ApacheCon Europe 2007, which took place in Amsterdam on 1-4 May 2007. All in all, this was a very interesting yet exhausting event, and the first such conference I have ever attended. My one regret is that the talks were a bit too purely informative rather than giving overall pictures and trends. What I was looking for was some insight into strategic design choices: why such and such software was created, what tradeoffs were involved, why open source was good for such a piece of software.

Anyway, the most interesting part of such events is the interaction with other people. While I am not a very sociable kind of guy, I managed to have interesting discussions and exchanges with several people.

20070507: ApacheCon Europe Day 4

This is the last day of ApacheCon Europe in Amsterdam. I had to leave before the official closing of the conference because of train schedules.

Web frameworks comparison (Matt Raible)  

Matt Raible presented his particular point-of-view on Web frameworks. Compared:

Basic dichotomy of frameworks:

Struts heavily dominates the scene because of its age. It will be there for a long time (unfortunately).

Developing with Roller and blogs (Dave Johnson)

Presentation of Roller, a professional blog web-application

ActiveMQ (Bruce Snyder)

Presentation of ActiveMQ, open-source Message Oriented Middleware.

20070503: ApacheCon Europe Day 3

Extending Ant (Steve Loughran, HP (??) Bristol)

Presents several extension points for Ant beyond the standard task extensions:

  • scripts: inline/included script fragments that get interpreted by Ant
  • resources: define new data types for stream I/O  operations

Scripts

  • Provide an easy way to extend/customize Ant behavior. Based on the Java 6 scripting framework (by default, uses JavaScript, but can use any BSF-compatible scripting language).
  • populate namespace with default values (self, attributes, project)
  • testing through AntUnit: invokes tasks and provides assertXXX elements (can check against log output)

Tasks

Standard way: write a Java class extending the Task class. Methods of the form addXXX(T t) are called automatically by Ant, which creates the typed instance for each nested element.

Resources

Resources are what gets manipulated by Ant in the end: files, URLs, directories, streams, data...

May be used as input/output streams of data.

handle references in XML (ID/IDREF mechanism)

Dependency management with Ivy (Xavier Hanin)

Presentation of Ivy dependency management tool.

Text search: Beyond Lucene with Solr (Bertrand Delacrétaz)

Presentation of work done for Télévision Suisse Romande on a VOD site with search and indexing. The application is based on Lucene and Solr.

Hadoop

Open-Source implementation of Map-Reduce engine.

see MapReduce by Ralf Lämmel
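To make the Map-Reduce idea concrete, here is a minimal single-process sketch of the programming model, using the classic word count. It is not Hadoop's API: the names `map_reduce`, `word_mapper` and `count_reducer` are made up for illustration, and a real engine would distribute the map, shuffle and reduce phases across machines.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper on every record, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # the "shuffle" step, done in memory here
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit a (word, 1) pair for every word of the line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, counts):
    """Sum all the partial counts emitted for one word."""
    return sum(counts)


if __name__ == "__main__":
    lines = ["to be or not to be", "to do"]
    print(map_reduce(lines, word_mapper, count_reducer))
```

The point of the model is that the mapper and reducer are pure functions over independent pieces of data, which is what lets an engine run them in parallel.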

Apache DS (Emmanuel Lécharny)

ADS: open-source LDAPv3-compliant server and system.

Jackrabbit (Alex Popescu, InfoQ and Jukka Zitting)

Jackrabbit is an open-source content repository compliant with the JCR (Java Content Repository, JSR 170) specification. The talk offers two points of view:

Main features are:

20070503: ApacheCon Europe Day 2

Second Day in Amsterdam, with the real conference sessions starting.  

Steven Pemberton (CWI / W3C)  

Lively and high-level talk about the level of abstraction in programming. Drew on ABC and Python experience to show how high-level concepts emerged in programming to harness more power from the machine. Growth in computing power has far outweighed growth in abstraction power (ie. the abstraction level of programming languages and the ease of programming).

Main emphasis should be on the usability of systems (vs. learnability: eg. a violin is usable, ie. fit for its purpose, but definitely not easy to learn).

Distinction between sensitives (eg. common people) and intuitives (eg. programmers): the former interact with the world through their senses, the latter interact with an internal model of the world. Most programmers are intuitives and most users are sensitives, hence the usability problem. What is usable from a programmer's POV (eg. a CLI) is not suitable for users.

Facts about software engineering:

  • 90% of the total cost of software is debugging (in the broad sense)
  • the number of bugs grows approximately as LOC^1.5, which is a superlinear increase, not a linear one

Hence reducing code size reduces the number of bugs: we need better abstractions to support the increase in power.
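As a back-of-the-envelope check of the LOC^1.5 figure, the sketch below compares relative bug counts for two code sizes. The function names and the code sizes are purely illustrative, and since the constant of proportionality is unknown, only ratios mean anything.

```python
def relative_bugs(loc: int, exponent: float = 1.5) -> float:
    """Bug count up to an unknown constant factor, assuming bugs ~ LOC^exponent."""
    return loc ** exponent


def bug_ratio(loc_before: int, loc_after: int) -> float:
    """How many times fewer bugs to expect after shrinking the code base."""
    return relative_bugs(loc_before) / relative_bugs(loc_after)


if __name__ == "__main__":
    # Halving a hypothetical 10,000-line program divides bugs by 2 ** 1.5.
    print(round(bug_ratio(10_000, 5_000), 2))  # 2.83
```

Under this assumption, halving the code size cuts the expected number of bugs by almost a factor of three rather than two, which is the argument for higher-level abstractions.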

Funnily enough, functional programming was never mentioned during the session.

Example: a clock is a view over the time model -> the MVC pattern applied to abstraction.

ABC is an ancestor of Python.

Web Security Trends

Awareness about security is increasing, yet there are more and more security problems:

  • number of attackers is also increasing
  • developers are not proactive enough in taking security into account
  • security loses the battle against feature development and delivery deadlines

XSS: Cross-site scripting

Basic principle:

  1. site provides a form where content can be input (eg. guest book)
  2. attacker provides dynamically interpreted content:
    • CSS attacks (eg. display manipulation, content loading through background: src ...)
    • phishing (replacing original sites content, eg. replacing divs)
    • javascript attacks (most common, may be very subtle)
  3. when the response is served, the browser interprets the content and triggers the attack (cookie/session stealing, redirected login input, credit card capture...)

New threats without Javascript: java+javascript, Flex...

Response:

  • validate every input character/string and escape the rest
  • use whitelisting to filter: unescape only explicitly allowed content
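A minimal sketch of the "escape everything, then unescape only what is explicitly allowed" response, using the standard `html.escape` function. The whitelist `ALLOWED_TAGS` and the function `render_comment` are hypothetical names of mine, and a production filter would need to handle far more cases (attributes, URLs, encodings).

```python
import html
import re

# Hypothetical whitelist: the only tags a guest-book entry may contain.
ALLOWED_TAGS = ("b", "i", "em", "strong")

# Matches an escaped, attribute-free opening or closing whitelisted tag.
_ALLOWED = re.compile(r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS))


def render_comment(raw: str) -> str:
    """Escape all user input, then re-enable only bare whitelisted tags."""
    escaped = html.escape(raw, quote=True)
    return _ALLOWED.sub(r"<\1\2>", escaped)


if __name__ == "__main__":
    print(render_comment("<b>hi</b><script>alert(1)</script>"))
```

Because the default is to escape, anything the whitelist does not recognise (a script tag, or an allowed tag carrying an event-handler attribute) stays inert text.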

CSRF: Cross-Site Request Forgeries

Principle:

  1. user visits a compromised/attacker-owned site
  2. reply contains javascript that  
  3. ...triggers request to a site where user is currently logged in (eg. make a bid on ebay, buy something, send some information...)

Difficult to defend against from client side:  

  • always log out explicitly
  • do not trust unknown sites

From server side:

  • require the user to log in again before important actions
  • use number sequences to identify valid requests
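One common way to "identify valid requests" on the server side is a per-session, single-use random token embedded in each form. This is my own sketch of that idea, not something shown in the talk; the session is modelled as a plain dict and the names `issue_token` and `is_valid_request` are hypothetical.

```python
import hmac
import secrets


def issue_token(session: dict) -> str:
    """Store a fresh unpredictable token in the session and return it for the form."""
    token = secrets.token_hex(16)
    session["csrf_token"] = token
    return token


def is_valid_request(session: dict, submitted: str) -> bool:
    """Accept a state-changing request only if it echoes the session's token."""
    expected = session.pop("csrf_token", None)  # single use: consumed on check
    if expected is None:
        return False
    # Constant-time comparison on bytes avoids leaking the token through timing.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A forged request sent from the attacker's page carries the victim's cookies but cannot know the token, so it is rejected.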

SQL Injection

Principle:

  • use the knowledge that some submitted data ends up as part of an SQL query to retrieve sensitive information from the server (passwords, session ids, identifiers, credit cards...)
  • can also be used for DoS attacks
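The principle can be reproduced with the standard `sqlite3` module and an in-memory database. The table, the sample users and the function names below are invented for illustration; the parameterized query is the usual defense and is my addition, not something recorded in these notes.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "s3cret"), ("bob", "hunter2")],
    )
    return conn


def find_user_unsafe(conn, name):
    """Vulnerable: the user input is pasted into the SQL text itself."""
    query = "SELECT name, password FROM users WHERE name = '%s'" % name
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    """Parameterized: the input is passed as data and never parsed as SQL."""
    return conn.execute(
        "SELECT name, password FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "x' OR '1'='1"
    print(len(find_user_unsafe(conn, payload)))  # every row leaks
    print(find_user_safe(conn, payload))         # no user has that literal name
```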

Ajax threats  

  • vulnerabilities in server-side APIs used to generate Javascript

Conclusion

  1. validate everything coming in
  2. escape everything going out

Strategic opportunities in OS - Rebecca Hansen, Sun

OSS is a structural change in the IT industry, as profound as the PC or Internet revolutions.

3 forces driving the change:

  1. network: implies 0-cost distribution, distributed and open development  
  2. community: traditional markets are transactional, OS market is relational: win-win relationship, interdependency, each actor is a member of community (vendor, user, developer, tester...)
  3. unbeatable value: lower cost for good enough software can give an edge against competition (no licence, simpler to use, large user base)

3 strategic opportunities:

  1. rebalance power  
  2. share cost  
  3. new source of wealth

Rebalance power

eg: IE vs. Firefox, Office vs. OpenOffice

OSS can rebalance power in mature, locked markets where TCO is high and the number of players is low. With enough momentum, OSS can become a challenger very quickly, eg. in the database market (the strategy around Derby, MySQL, PostgreSQL). It only has to be good enough.

Howto:

Share

Problem: OEM technology, costs a lot, tied to a vendor without control, not strategic

Solution: use OS.  

If you get milk for free, do not buy the cow: feed it or share it.

New wealth source

Transactional market = lot of cost in marketing, two sources of revenue (selling licence, selling support)

Use community marketing instead of traditional marketing: peer-to-peer, relational means more trust, costs less money

Strategies:

  1. create revenue stream through support and service
  2. create value through differentiation (value-added product over OSS)
  3. complementary products  
  4. reach more people

No-nonsense introduction to Semantic Web - Stefano Mazzocchi, MIT

Project Simile, various tools for aggregating sites data using RDF.  

RDF = provide metadata in web for tools to process them  

RDF = solution to the interoperability problem at worldwide scale
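To illustrate why this helps aggregation, here is a toy model of RDF as a set of (subject, predicate, object) triples. It does not use any real RDF library or the Simile tools; the `ex:` identifiers and the `match` helper are made up, and the two "sites" are assumed to share a vocabulary.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Two hypothetical sites publishing metadata about overlapping resources.
SITE_A: Set[Triple] = {
    ("ex:talk42", "dc:title", "Introduction to RDF"),
    ("ex:talk42", "dc:creator", "ex:alice"),
}
SITE_B: Set[Triple] = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:talk42", "dc:creator", "ex:alice"),
}


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Set[Triple]:
    """Return the triples matching the pattern; None acts as a wildcard."""
    return {
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


# Aggregating two sources is just the union of their triples.
merged = SITE_A | SITE_B
```

With a shared vocabulary the merged graph connects the talk to its author's name. Without one, the same union yields exactly the disconnected graphs listed among the problems.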

Problems:  

  1. marketing! The specification is very complex and abstract
  2. the XML serialization complicates things: people got confused and rejected RDF
  3. technical: produces a lot of disconnected graphs, no common vocabulary, may have inconsistencies
  4. querying is somewhat difficult and non-uniform

Follow-up: Birds of a Feather session, 20:00-21:00

A lot of people are interested in RDF, which seems to be maturing. There is a need for software and for traction from Apache.

Building RESTful services

Problem with POSTs:  

20070501: ApacheCon Europe Day 1

This is my first day at ApacheCon Europe 07, taking place in Amsterdam. The day was dedicated to tutorials, of which I followed two:

  • maven+archiva+continuum tutorial, by Brett Porter. Nice presentation of the Maven ecosystem. I was particularly interested in Archiva, which acts both as a proxy and as a local repository manager. I had a hard time starting the whole thing up, maybe because of some problem with my Java configuration: the web application hung in an endless loop under jdk 1.5.0_06. It worked fine with the jdk from Java EE 5, which is jdk 1.6. The talk was a bit too short, with the technical part taking the greatest share of the time at the expense of the software development process part. I would have liked to see more best practices, design choices and overall insight.
  • wicket tutorial, by Mattej Koop. Wicket is a modern web application framework that tries to stay away as much as possible from XML, heavy-handed configuration files and anything that is not Java. All components are written in Java and tied to HTML template fragments, with a clear separation of logic and presentation. UI components draw data from Model objects that wrap standard beans, allowing field validation code to be plugged in. Maybe the code for these could have been made a little more dynamic, so that type information could be used earlier in the validation process than it is now.