In an age of ever-increasing information collection and the need to evaluate it, building systems that use the untapped compute resources available in everyone’s home and hands should be driving the development of more sophisticated distributed computing systems. Today, large data processing facilities provide significant compute capabilities. Using the worldwide plethora of distributed resources in a coherent way would be much more powerful.

Distributed programming and processing tools and techniques are a reality today but are in their infancy. Rapid growth of distributed systems is already supported by several trends:

  • Storage, bandwidth and CPUs are on course to become nearly free (Free: The Future of a Radical Price).
  • The number of people and devices connected to the internet keeps growing.
  • Data storage requirements increase as more sources accumulate more data.

It is becoming more common to see terabyte storage devices in homes. Desktops and laptops have become commodities affordable to the average consumer. You can stake a claim to a table at your local coffee shop and access the internet for free. Becoming a social citizen on the internet with a portable compute resource was once cost prohibitive; the price is now plummeting to a level affordable to a significant portion of the population.

Distribution of affordable, cheap and free compute devices to the general public continues to grow. Most of the resources sit idle much of the time. Game consoles, cell phones, laptops, desktops, etc. can now all participate in the storage and processing of data.

Like it or not, capturing and sharing data is becoming increasingly easy. You can watch your favorite gorilla in the jungle, or use collective intelligence to extract social and individual patterns with service APIs provided by large corporations like Google and Amazon. Today’s transient data coming from real-time sources will eventually be stored. Much of the data is and will be captured and stored in perpetuity at corporate web sites and in data centers. Some of what should be available may be accessible through the gates of these data centers.

The approach to managing and controlling processing remains focused on huge data centers. In this sense, social and engineering thought is still akin to the 19th century practice of building monolithic systems with centralized control. As data generation increases and the cost of storage decreases, huge data centers are being built to house and process data by Google, Apple, Codera and NTT America, to name just a few. What will they do with all this data and how much will be shared?

IBM announced its plans to build a petaflop machine for the SKA telescope program. It is a laudable and beneficial effort, and the research and lessons learned from it will undoubtedly be valuable. But efforts should also be made to build distributed systems of equal or greater benefit. Efforts such as BOINC provide a rudimentary but effective start. File sharing peers using distributed hash tables (DHT) have already demonstrated power and influence. Both illustrate cost-effective usage of existing distributed compute resources where most data is accessible to everyone.

Distributed Computing is in its infancy (I’m not referring to Cloud Computing). A number of technologies supporting distributed computing have been developed. Some have survived and some have waned. A sophisticated distributed system is on par in importance with nanotechnologies and artificial intelligence, and it will support those other technologies as well. It has the potential to distribute the energy needs of processing rather than requiring a power plant dedicated to running a data center. It has the potential to distribute data storage so that data is never lost, and to give individuals a means to control their own personal information. It has the potential to provide mechanisms that capture data in real time and process it as needed, where needed, with the most efficient usage of resources. In so doing, it would mirror the real world (à la Gelernter’s Mirror Worlds).

So although building data center citadels and powerful HPC computers is valuable, so is developing and building sophisticated distributed computing systems. In fact, it’s likely much more important.


The 2009 TED conference was this week. This is its 25th year, although it is the first time I have heard of it, thanks mostly to Twitter and those twittering the experience. I didn’t attend but followed some of the activities and sessions. It is a gathering and sharing of great minds, their visions, aspirations and creations in both science and art. I hope to be able to attend in person at some point.

With all the talk and demos about technological advances and the need to capture, mine and process the vast amounts of electronic data produced, I’m surprised there was no mention of harnessing the compute power available in phones, desktops, clouds, supercomputers and all devices everywhere, or of how that might be done. Maybe I missed it (since I wasn’t there) or maybe it wasn’t the proper forum for that kind of discussion, but it seems to me that it was glaringly missing.

If anyone knows about such discussions taking place at the conference in sessions or even breakout groups, I am interested in finding out about them.

There are a number of sites with information about installing Puppet on Solaris. They each contain slightly different instructions which get you most of the way there. With a little finesse it’s not hard to follow the instructions and get things working. This post includes yet another set of instructions for installing Puppet and getting things running. Hopefully with these instructions and others as reference your installation goes smoothly.

For those who are unfamiliar with Puppet, it is a tool for automating system administration. It is built and supported by Reductive Labs. They describe Puppet as a declarative language for expressing system configuration, a client and server for distributing it, and a library for realizing the configuration. Rather than a system administrator having to follow procedures, run scripts and configure things by hand, Puppet lets you define a configuration, automatically applies it to the specified servers and then maintains it. Puppet can be downloaded for many of the most popular operating systems. There is a download page with links to some installation instructions.

Installation on Solaris

1) To make installation more automated, install the Solaris package pkg-get. This tool simplifies getting the latest version of packages from a known site. A copy can be found at Blastwave.

Download http://www.blastwave.org/pkg_get.pkg to /tmp.
Make sure the installation is done with root privilege: su to root.
Run the following command from the /tmp directory:

# pkgadd -d pkg_get.pkg

The package can also be added using the following command

# pkgadd -d http://www.opencsw.org/pkg_get.pkg

2) Verify that the pkg-get configuration file is configured for your region (in this case, the U.S.). Change the default download site in the configuration file /opt/csw/etc/pkg-get.conf to:

url=http://www.ibiblio.org/pub/packages/solaris/opencsw/stable

or

url=http://www.ibiblio.org/pub/packages/solaris/opencsw/current

3) Add some new directories to your path.  pkg-get, wget and gpg are installed in /opt/csw/bin.

# export PATH=/opt/csw/bin:/opt/csw/sbin:/usr/local/bin:$PATH

4) Install the complete wget package. wget is a GNU tool used to download files from the web. It is very useful for automating installs and software updates, and pkg-get uses it.

# pkg-get -i wget

Note:

If you haven’t installed the entire Solaris OS, pkg-get may fail to install wget with the error:

“no working version of wget found, in PATH”

This is probably due to missing SUNWwgetr and SUNWwgetu packages. To install them, insert a Solaris installation DVD into the DVD-ROM drive and mount it at /media/xxxx.

Install the Solaris packages

# pkgadd -d . SUNWwgetr
# pkgadd -d . SUNWwgetu

5) Configure pkg-get to support automation.

# cp -p /var/pkg-get/admin-fullauto /var/pkg-get/admin

6) Install gnupg and an md5 utility so security validation of Blastwave packages can be done.

# pkg-get -i gnupg textutils

You may also need to set LD_LIBRARY_PATH to /usr/sfw/lib so that needed libraries can be found.

7) Copy the Blastwave PGP public key to the local host.

# wget --output-document=pgp.key http://www.blastwave.org/mirrors.html

8) Import the PGP key.

# gpg --import pgp.key

9) Verify that the following two lines in /opt/csw/etc/pkg-get.conf are COMMENTED OUT.

#use_gpg=false
#use_md5=false

10) Puppet is built with Ruby. Install the Ruby software (CSWruby) from Blastwave.

# pkg-get -i ruby

11) Install the Ruby Gems software (CSWrubygems) from Blastwave.

# pkg-get -i rubygems

12) Update to the latest versions and install the gems used by Puppet.

# gem update --system

# gem install facter

# gem install puppet --version '0.24.7'

Substitute the current version if it is newer. The gem update command can also be used to update the software.

# gem update puppet

13) Create the puppet user and group:

Info to add in /etc/passwd: puppet:x:35001:35001:puppet user:/home/puppet:/bin/sh
Info to add in /etc/shadow: puppet:LK:::::::
Info to add in /etc/group: puppet::35001:

14) Create the following core directories and set the permissions:

# mkdir -p /sysprov/dist/apps /sysprov/runtime/puppet/prod/puppet/master
# chown -R puppet:puppet /sysprov/dist /sysprov/runtime

15) Add Puppet configuration definitions to /etc/puppet/puppet.conf. The initial content, using your own puppetmaster hostname, should be:

[puppetd]
server = myserver.mycompany.com
report = true

16) Repeat this process for the servers which will run Puppet. At least 2 should be set up. One will be the Master Puppet server, the other a Puppet client server that will be managed.

Validating the Installation and Configuring Secure Connections

To verify that the Puppet installation is working as expected, pick a single client to use as a testbed. With Puppet installed on that machine, run a single client against the central server to verify that everything is working correctly.

Start the master puppet daemon on the server defined in puppet.conf files.

# puppetmasterd --debug

Start the first client in verbose mode, with the --waitforcert flag enabled. The default server name for puppetd is puppet. Use the --server flag to name the server running puppetmasterd. Later the server hostname can be added to the configuration file.

# puppetd --server myserver.mycompany.com --waitforcert 60 --test

Adding the --test flag causes puppetd to stay in the foreground, print extra output, run only once and then exit, and exit immediately if the remote configuration fails to compile (by default, puppetd will use a cached configuration if there is a problem with the remote manifests).
Running the client should produce a message like:

info: Requesting certificate
warning: peer certificate won’t be verified in this SSL session
notice: Did not receive certificate

This message will repeat every 60 seconds with the above command. This is normal, since your server is not initially set up to auto-sign certificates as a security precaution. On your server running puppetmasterd, list the waiting certificates:

# puppetca --list

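You should see the name of the test client. Now go ahead and sign the certificate, using the client name exactly as it appears in the list (myclient.mycompany.com is a placeholder):

# puppetca --sign myclient.mycompany.com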

The test client should receive its certificate from the server, receive its configuration, apply it locally, and exit normally.
By default, puppetd runs with a waitforcert of five minutes; set the value to the desired number of seconds or to 0 to disable it entirely.

Getting this far, you now have puppet installed with a base initial configuration and secure connections defined between a Puppet master server and one puppet client server. At this point you can start defining manifests for desired server configurations.

There are various sample recipes and manifests to start working with. Viewing and editing some of these is a good place to start learning how to create configuration definitions. If there is interest, I can share a sample as well if I have one that may be useful for your needs.

Typically ‘Change in Government’ is an oxymoron. It has been a mantra in the U.S. for past months. I’m all for changing things for the benefit of everyone. I’m positive about things becoming better in the next 4-8 years. I would like to be part of the positive changes. But claiming that Cloud Computing will be a positive change in how people work with the government is just plain marketing hype and jumping on the technology bandwagon.

This is just another example of a perceived (marketed) new technology that will solve our problems. Don’t hold your breath. Things don’t change overnight. Cloud Computing is not a panacea for computing problems. It is an architectural perspective that can be leveraged for specific computing needs. There are a number of good application implementations using this architectural approach. However, it’s not appropriate to force everything into this box.

In the case of the government and many corporations, existing applications can’t be easily ported over to Cloud Computing. Application software needs to be re-engineered or built from scratch to take best advantage of the cloud model. There are small pockets of innovation in government and large industry. Such innovation is typically sequestered off in a corner, has a hard time making headway and is quickly surpassed by more agile small companies. This has been the scenario with almost all new technologies and ideas.

Will the government’s use of Cloud Computing introduce change? Hmm, maybe. Certainly not, though, if there is no substance behind the marketing hype. And certainly not if they try implementing this from within the organization rather than using innovative startups. For anyone else considering Cloud Computing, make sure you understand what it is and why you want to use it. Then make sure you and others will benefit from your use of it. Otherwise you’re just spreading the hype and jumping on the bandwagon.

Most distributed computing environments consist of a set of known devices in well known configurations. The devices are usually cataloged somewhere in a database by a person. If the state of the environment changes by a server being removed, replaced or taken offline, the database must be updated. This kind of environment I refer to as Statically Defined and inflexible.

This is a feasible way of accounting for a small number of machines in a network. However, as data centers grow to hundreds or thousands of servers and as all computing devices begin to come and go at random, keeping track of changes must become dynamic.

Software architecture assumptions are typically made with the expectation that there will always be access to a desired server at a known location. To handle the worst case, it’s typical to configure clusters of servers so that if one fails another takes over and the software architect can ignore failures. This only marginally protects against failures. More realistic designs must account for the fact that almost anything can fail at any given moment. The architect’s philosophy should be that all things will fail and all failures should be handled.
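
The "all things will fail" philosophy can be illustrated with a small Java sketch. It is only an illustration under assumptions: the class name Failover, the method callAny and the host names are hypothetical, and a real system would also need timeouts, retries and health checks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: treats every remote call as something that can fail.
class Failover {

    // Tries each host in order and returns the first successful result.
    // If every host fails, throws one exception carrying all the failures.
    static <T> T callAny(List<String> hosts, Function<String, T> call) {
        List<RuntimeException> failures = new ArrayList<>();
        for (String host : hosts) {
            try {
                return call.apply(host);
            } catch (RuntimeException e) {
                failures.add(e); // remember the failure and try the next host
            }
        }
        IllegalStateException allFailed =
                new IllegalStateException("all " + hosts.size() + " hosts failed");
        for (RuntimeException e : failures) {
            allFailed.addSuppressed(e);
        }
        throw allFailed;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("host-a", "host-b");
        // Simulated call: host-a is down, host-b answers.
        String reply = callAny(hosts, host -> {
            if (host.equals("host-a")) {
                throw new IllegalStateException(host + " is unreachable");
            }
            return "reply from " + host;
        });
        System.out.println(reply);
    }
}
```

The caller never assumes a particular server is reachable; a failure of one host is an expected event, and only the failure of every host is reported as an error.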

Consider an architecture for a system that is hard to shut down rather than one that handles only a few failure scenarios. One such architecture is that of the peer-to-peer (p2p) file sharing systems running across the internet. From the perspective of any client, the system is always running and available. As long as the client has access to the internet, accessing shared files is almost always possible.

Core to p2p architecture is a network overlay that uses distributed hash table algorithms to manage mappings of hosts across the internet as they dynamically join and leave. Add to this

  • a mechanism to determine the attributes of each host, such as hardware, OS and storage capacity,
  • software deployment and installation capabilities at each host,
  • an algorithm to match each service to the host best suited to executing it, and
  • monitoring capabilities to ensure services are executing to defined SLAs,

and you have an architecture that dynamically scales and maintains itself. Assimilator is one of the few systems capable of doing this today.
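
To make the overlay idea concrete, here is a minimal Java sketch of a hash ring in the style of a distributed hash table. It is not Assimilator code and not a full DHT (there is no routing, replication or networking); the class name HashRing, its methods and the host and service names are all hypothetical. It only shows how keys can be mapped to hosts that join and leave at will.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a DHT-style ring; this is not Assimilator code.
class HashRing {
    // Ring position -> host name, kept sorted by position.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // Maps any string to a ring position using the first 8 bytes of its SHA-1 digest.
    static long position(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long pos = 0;
            for (int i = 0; i < 8; i++) {
                pos = (pos << 8) | (digest[i] & 0xff);
            }
            return pos;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    void join(String host) {
        ring.put(position(host), host);
    }

    void leave(String host) {
        ring.remove(position(host));
    }

    // The owner of a key is the first host at or after the key's position,
    // wrapping around to the start of the ring.
    String lookup(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no hosts have joined");
        }
        Map.Entry<Long, String> owner = ring.ceilingEntry(position(key));
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        ring.join("host-a");
        ring.join("host-b");
        ring.join("host-c");
        String owner = ring.lookup("render-service");
        System.out.println("owner: " + owner);
        ring.leave(owner); // the owner disappears without warning
        System.out.println("owner after leave: " + ring.lookup("render-service"));
    }
}
```

When a host leaves, only the keys it owned move to the next host on the ring; every other mapping stays put, which is what lets membership change without a central catalog being updated by hand.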

Some Cloud Computing vendors claim massive scaling capabilities. This of course assumes the vendor has many thousands of servers and that clients have statically defined their usage of servers in advance. True massive scaling will come with resources that are allocated automatically and managed dynamically without human intervention.


While searching for codecs that would let QuickTime play AVI files ripped from DVDs on my MacBook Pro, I ran across Perian, ‘The swiss-army knife for QuickTime’. Perian includes video and audio codecs for many common formats.

It’s simple to use! Download the disk image, open it, double click the Perian.prefPane icon and all the components are automatically installed in the correct directories. All the video and audio plays properly for me.

Hats off to the Perian Project Team for a great job!

Overview

There are plenty of examples describing how to build applications from existing database content using various tools and techniques. The assumption in the examples is that either the developer must use a legacy database or that the application is being built from the perspective of data modeling with an existing database. Some examples include:

But what if you want to build the logical data model defining classes first and you don’t have a pre-built database? You want to define only the object model with attributes and operations that make sense for the application in terms of business requirements and not have to be concerned with database entities and the corresponding relationships. Also you need serialized objects that can be easily passed between servers and clients. This article describes a way to define the logical business model first and dynamically create the corresponding schema, database and scripts which can be used to manage the database. It also briefly covers using inheritance strategies which can be applied in the creation of a database.

This example was built with

  • Netbeans 6.5
  • Java 6

The project files are located at my open source Assimilator project site.

The Model

A simple model for cycling teams is constructed in this example. It represents teams, team members and an inheritance structure for the different types of bicycles that a team member may have. The UML class diagram for the logical object model is shown below.

Cycling Teams Class Diagram

Each team has a set of members, each member is designated a specific type and each member can have a set of associated bicycles. Bicycle is an abstract base class and there are three concrete classes which extend Bicycle.
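
As a rough illustration of this model, the class hierarchy can be sketched in plain Java before any persistence annotations are added. The class names come from the project; the field names (name, memberType, model) are hypothetical, java.util.Set is an assumed collection type, and the roundTrip helper exists only to show that the objects can be serialized and passed between servers and clients.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

// Plain-Java sketch of the model; field names are hypothetical.
class CyclingTeam implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    Set<TeamMember> teamMembers = new HashSet<>();

    // Serializes the team to bytes and reads it back, as would happen
    // when the object is passed between a server and a client.
    static CyclingTeam roundTrip(CyclingTeam team) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(team);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CyclingTeam) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        CyclingTeam team = new CyclingTeam();
        team.name = "Sample Team";
        TeamMember rider = new TeamMember();
        rider.name = "Sample Rider";
        rider.memberType = "sprinter";
        rider.bicycles.add(new TrackBicycle());
        team.teamMembers.add(rider);

        CyclingTeam copy = roundTrip(team);
        System.out.println(copy.name + " has " + copy.teamMembers.size() + " member(s)");
    }
}

class TeamMember implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    String memberType; // each member is designated a specific type
    Set<Bicycle> bicycles = new HashSet<>();
}

// Abstract base class; the key field lives here, not in the subclasses.
abstract class Bicycle implements Serializable {
    private static final long serialVersionUID = 1L;
    Long id;
    String model;
}

class RoadBicycle extends Bicycle {
    private static final long serialVersionUID = 1L;
}

class TimeTrialBicycle extends Bicycle {
    private static final long serialVersionUID = 1L;
}

class TrackBicycle extends Bicycle {
    private static final long serialVersionUID = 1L;
}
```

Starting from classes like these keeps the focus on business attributes and relationships; the persistence annotations are layered on afterwards.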

Defining the Database Connection

Create a new database for the cycling team information …

  1. In Netbeans, click on the Services tab,
  2. right-click on JavaDB and select Create Database…
  3. fill in the dialog with the following information (the database location will be in the default Netbeans install directories)
  4. click OK
Create Java DB

  1. right-click on Databases
  2. select create Database connection
  3. fill in the dialog with the values defined using the user name and password assigned while creating the database
  4. add ‘create=true’ in the Additional Props field
  5. select OK which makes the Advanced settings visible
  6. enter CYCLING for the schema type
  7. select OK
Database Connection

The new connection should show up in the connection list.

Creating the Netbeans Project

Project Creation

A compressed file containing all the Netbeans project files is included with this post. Uncompress the file and open the uncompressed project in Netbeans using File->Open Project.

Alternatively, you can create a new project and add in the files from the included archive. To create a new project:

  1. select File->New Project…
  2. select Java category and Java Application Project type
  3. select the Next button
  4. enter CyclingTeam for the project
  5. define the directory location to store the project files
  6. deselect Create Main Class and Set as Main Project
  7. select the Finish Button


Libraries

To compile and run the code Hibernate and JavaDB libraries need to be included in the project settings. These libraries are included with the sample project file.

  1. right-click on the project icon and select Properties
  2. select Libraries in the Categories panel
  3. select the Compile tab
  4. click the Add Jar/Folder… button and add all the jar files from the hibernate-support directory and the javaee.jar
  5. select relative path
  6. Click the Choose button
  7. add the same libraries under the Run Tests tab
  8. add the JUnit 4.5 library under the Compile Test tab
  9. click OK on the properties dialog box
Project Library Settings

Coding

Implementing the Business Model

All the code for this project is included in the project file. There are a few things to note about the source code. Following the logical model, first create the CyclingTeam class. Creating this class using the Netbeans wizards is an easy way to be guided through the creation of the persistence unit file.

  1. right-click the CyclingTeam project’s Source Packages node
  2. select new entity class
  3. enter CyclingTeam for the classname
  4. com.sample.cycling for the package name
  5. Long for the Primary key type
  6. click on the Create Persistence Unit button
  7. in the new dialog box enter CyclingTeamPU for the Persistence Unit Name
  8. select Hibernate for the Persistence Provider
  9. on the Data Source pull-down menu select New Data Source
  10. enter CyclingTeams for the JNDI name
  11. select the CyclingTeams database connection
  12. click OK
  13. click Create
  14. click Finish

This example uses EJB3 annotations. Every class defined in this example corresponds to an entity, so each is tagged with the @Entity annotation at the beginning of the class definition. Classes also have an @Id annotation defined on the variable to be used as the database key value. In this example, the strategy for generating the value is AUTO, which is the default.

In the CyclingTeam class, in addition to the code generated by the wizard, add the following code:

    // Needs java.util.Collection and javax.persistence.{OneToMany, CascadeType, FetchType}
    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER)
    Collection<TeamMember> teamMembers;

    public Collection<TeamMember> getTeamMembers() {
        return teamMembers;
    }

    public void setTeamMembers(Collection<TeamMember> teamMembers) {
        this.teamMembers = teamMembers;
    }

The class contains a Collection of TeamMembers. The OneToMany annotation defines a many-valued association with one-to-many multiplicity to be used in the database schema. The cascade type defines which operations must be cascaded to the target of the association; in this case, all operations. Fetch defines that the association must be eagerly fetched rather than lazily loaded, which is the default. The corresponding getter and setter methods are defined. The TeamMember class also has a collection of Bicycles defined in the same way with annotations.

Notice that an @Id is defined in the Bicycle abstract base class but not in the other concrete bicycle classes. The key value only needs to be defined in the base class.

Note: At the time of writing, the Hibernate JPA persistence provider has some known issues that can cause problems. While not specific to NetBeans, these issues can be worked around by choosing eager as the association fetch value [@OneToMany(cascade = CascadeType.ALL, fetch=FetchType.EAGER)] and java.util.Set as the collection type. When using the Hibernate JPA persistence provider and choosing default or lazy as the association fetch value for mapping, an org.hibernate.LazyInitializationException can occur at runtime. If you choose an association fetch value of eager and you do not also choose java.util.Set as the collection type, an org.hibernate.HibernateException can occur (with the message “cannot simultaneously fetch multiple bags”) at runtime. These issues pertain to the Hibernate JPA persistence provider and not to the default persistence provider.

Persistence Unit

The persistence.xml file generated earlier needs to have all the entity classes added. Open the file which is in the Configuration Files folder.

  1. select the Design tab
  2. select Add Classes button
  3. select all the classes listed in the Add Entity Class dialog box
  4. select OK

To add the rest of the Hibernate properties, select the XML tab and copy in the properties listed below. The entire persistence.xml file should look like the following:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="1.0" xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd">
  <persistence-unit name="CyclingTeamPU" transaction-type="RESOURCE_LOCAL">
    <provider>org.hibernate.ejb.HibernatePersistence</provider>
    <class>com.sample.cycling.Bicycle</class>
    <class>com.sample.cycling.CyclingTeam</class>
    <class>com.sample.cycling.RoadBicycle</class>
    <class>com.sample.cycling.TeamMember</class>
    <class>com.sample.cycling.TimeTrialBicycle</class>
    <class>com.sample.cycling.TrackBicycle</class>
    <properties>
      <property name="hibernate.cache.provider_class" value="org.hibernate.cache.NoCacheProvider"/>
      <property name="hibernate.dialect" value="org.hibernate.dialect.DerbyDialect"/>
      <property name="hibernate.show_sql" value="false"/>
      <property name="hibernate.sequence" value="cycling_teams_sequence"/>
      <property name="hibernate.connection.url" value="jdbc:derby://localhost:1527/CyclingTeams;create=true"/>
      <property name="hibernate.connection.driver_class" value="org.apache.derby.jdbc.ClientDriver"/>
      <property name="hibernate.connection.password" value="rocks"/>
      <property name="hibernate.connection.username" value="cycling"/>
      <property name="hibernate.hbm2ddl.auto" value="create-drop"/>
    </properties>
  </persistence-unit>
</persistence>

JPA Controllers

A set of classes which control access to the entities, manage the database connections and provide the base logic for correctly mapping the relationships within the business classes can be  automatically generated.

  1. right-click the CyclingTeam-ejb project icon
  2. select New->Generated JPA Controller Classes from Entity Classes…
  3. select Add All to include all the entity classes
  4. select Next
  5. set the package name to com.sample.cycling.jpacontrollers
  6. select Finish

Controllers define the JPA access through create, edit, destroy, and find methods. The controllers define the entity managers to use and manage the transactions. There are controllers for each entity type. Each controller manages the associations between entities as well. For example, if a CyclingTeam has a collection of associated TeamMembers  and the CyclingTeam is deleted from the database, the association references in the database are also cleaned up.

Additional code may need to be added to the controllers to help manage the objects in the model. In this example a check for null Collection fields is added to the default generated code. In CyclingTeamJpaController the following code was added to the create method:

// Guard against a null collection before the entity is persisted
if (cyclingTeam.getTeamMembers() == null) {
    cyclingTeam.setTeamMembers(new HashSet<TeamMember>());
}

Similar code is added to the create method in TeamMemberJpaController to check for a null Bicycles collection.

Testing

Creating unit tests is a useful way to exercise the code, to understand how it works and to identify errors.

  1. right-click on the CyclingTeamJpaController class
  2. select Tools -> Create JUnit Tests…
  3. use the default settings making sure the Location is set to  Test Packages
  4. click ok

A default unit test class is created with code generated to exercise all the methods in the CyclingTeamJpaController class. By default all the tests are set to fail. They must be edited to perform the desired tests. The project file included with this post has a few edited unit test classes. The tests provide a starting point for understanding how the JPA controllers can be called by web services or EJBs.

To run the unit test, right-click in the edit window of the class and select Run File.  The unit test causes the database schema to be created and test records to be inserted, read and edited.

To view the contents of the database, comment out the code in the teardown method of the CyclingTeamJpaController test class and run the tests. Select the Services tab and open the JDBC connection for CyclingTeams. Right-click on the Tables node and select Refresh. The tables created in the database should now be visible. Right-click on the CYCLINGTEAM table and select View Data. This will submit an SQL query to show all the records in the table.

Creating DDL

Using the Hibernate tools, DDL SQL scripts can be generated to create all the tables and relationships for the business model and to clean up the database with drop statements for all the generated tables. In the project build.xml file, located in the top level directory of the project, add the following target at the end of the file, before the closing </project> tag.

<target name="-post-jar">
        <path id="hibernate.tools.lib">
            <pathelement path="${file.reference.antlr-2.7.6.jar}"/>
            <pathelement path="${file.reference.asm-attrs.jar}"/>
            <pathelement path="${file.reference.asm.jar}"/>
            <pathelement path="${file.reference.cglib-2.1.3.jar}"/>
            <pathelement path="${file.reference.commons-collections-2.1.1.jar}"/>
            <pathelement path="${file.reference.commons-logging-1.1.jar}"/>
            <pathelement path="${file.reference.dom4j-1.6.1.jar}"/>
            <pathelement path="${file.reference.ehcache-1.2.3.jar}"/>
            <pathelement path="${file.reference.hibernate-annotations.jar}"/>
            <pathelement path="${file.reference.hibernate-commons-annotations.jar}"/>
            <pathelement path="${file.reference.hibernate-entitymanager.jar}"/>
            <pathelement path="${file.reference.hibernate-tools.jar}"/>
            <pathelement path="${file.reference.hibernate3.jar}"/>
            <pathelement path="${file.reference.javassist.jar}"/>
            <pathelement path="${file.reference.jdbc2_0-stdext.jar}"/>
            <pathelement path="${file.reference.jta.jar}"/>
            <pathelement path="${file.reference.javaee.jar}"/>
            <pathelement path="${file.reference.freemarker.jar}"/>
            <pathelement path="${dist.jar}"/>
        </path>

        <taskdef name="hibernatetool" classname="org.hibernate.tool.ant.HibernateToolTask" classpathref="hibernate.tools.lib" />
        <hibernatetool destdir="${dist.dir}">
            <jpaconfiguration persistenceunit="CyclingTeamPU"/>
            <classpath>
                <path location="${dist.jar}"/>
            </classpath>

            <!-- Create SQL script -->
            <hbm2ddl outputfilename="cycling_teams_create.sql" export="false" format="true"/>

            <!-- Drop SQL script -->
            <hbm2ddl outputfilename="cycling_teams_drop.sql" export="false" create="false" drop="true" format="true"/>
        </hibernatetool>

    </target>

The target is executed after the project jar file is successfully created. Two scripts are generated in the ./dist directory.

  • cycling_teams_create.sql creates the tables.
  • cycling_teams_drop.sql cleans up (drops) the tables.

These scripts are generated each time a Clean and Build is run on the project.

Summary

Projects that store data in a database can be developed by creating the data model and database first and then coding to that structure. Alternatively, as in this example, a business object model can be created first and used to automatically generate the schema and database tables. Team skills and familiarity with various tools tend to dictate which approach is used.

For new projects, the method described in this example lets development cycle rapidly through business object model changes and map those changes directly to a database. Many development cycles can be run to accommodate rapid changes until the model and database structure have been normalized. The result should be a coherent object model that is easy to develop with, serialized objects that are easy to pass around, and a direct mapping to a database that takes advantage of the Hibernate and EJB3 tools to help manage the data for you.

The Nepomuk project has been busy developing software for semantic relationships on your desktop machine. Technology Review published an article about the release of the software. There is a free version for Windows, Mac and Linux. More information about the Nepomuk software is on their wiki site.

I will be trying out this software and will follow up with my thoughts. If you do as well, please feel free to post comments about it here.

Lately there has been publicity claiming that the major corporate cloud computing offerings are really just a play to lock you into vendor-specific solutions while the vendors collect information about you and your customers.

Richard Stallman says cloud computing is ‘stupidity’ that ultimately will result in vendor lock-in and escalating costs.

Oracle’s Larry Ellison says cloud computing is basically more of what we already do. I think he is saying Oracle will continue business as usual, jump on the bandwagon and use the term.

Tim Bray blogged about what it takes to avoid cloud computing vendor lock-in: “deploying my app on Vendor X’s platform, there have to be other vendors Y and Z such that I can pull my app and its data off X and it’ll all run with minimal tweaks on either Y or Z.”

Even Steve Ballmer seems to be against cloud computing, citing that consumers don’t want it. I’m not sure I understand his argument, other than that it essentially says some proprietary software is required to run in the context of someone’s cloud.

Tim O’Reilly wrote an excellent blog post on Open Source and Cloud Computing. He provides this bit of laudable advice: “If you care about open source for the cloud, build on services that are designed to be federated rather than centralized. Architecture trumps licensing any time.”

While some of the paranoia about being locked in and vulnerable to a corporation is warranted, there is also an undercurrent of revulsion at its marketing. This stems from the fact that the term ‘cloud computing’ has already achieved a high silliness factor through its use to brand everything (à la 2.0). Also, this computing model is not yet sorted out and should evolve into something better with input and guidance from those who are technology-savvy.

A well-constructed architecture for a distributed execution platform will provide a truly open and scalable solution for clouds and distributed computing in general. By Distributed Execution Platform I mean primarily a platform which can, among other things:

  • dynamically discover resources on a network
  • dynamically provision software services where execution is most efficient
  • manage services as needs dynamically change
  • detect failures and automatically reconfigure itself to accommodate them
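
The four capabilities above can be sketched in a few lines of Java. This is only a minimal in-memory illustration under loud assumptions: the class PlatformSketch, its methods (announce, provision, reconfigure, locate), the "most free capacity" placement rule and the heartbeat timeout are all hypothetical names and policies invented for this sketch, not part of any existing platform. A real system would discover nodes over the network rather than through direct method calls.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory sketch; names and policies are illustrative only. */
class PlatformSketch {
    /** A compute resource that has announced itself to the platform. */
    static final class Node {
        final String id;
        final int capacity;      // how many services this node can host
        long lastHeartbeat;      // time of the most recent announcement
        final List<String> services = new ArrayList<>();

        Node(String id, int capacity, long now) {
            this.id = id;
            this.capacity = capacity;
            this.lastHeartbeat = now;
        }

        int free() { return capacity - services.size(); }
    }

    private final Map<String, Node> nodes = new HashMap<>();
    private final long timeoutMillis;

    PlatformSketch(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    /** Discovery: a node announces itself, or refreshes its heartbeat. */
    void announce(String id, int capacity, long now) {
        Node n = nodes.get(id);
        if (n == null) nodes.put(id, new Node(id, capacity, now));
        else n.lastHeartbeat = now;
    }

    /** Provisioning: place the service on the node with the most free capacity. */
    String provision(String service) {
        Node best = null;
        for (Node n : nodes.values()) {
            if (n.free() > 0 && (best == null || n.free() > best.free())) best = n;
        }
        if (best == null) throw new IllegalStateException("no capacity for " + service);
        best.services.add(service);
        return best.id;
    }

    /** Failure handling: drop silent nodes and re-provision their services. */
    List<String> reconfigure(long now) {
        List<String> orphaned = new ArrayList<>();
        nodes.values().removeIf(n -> {
            boolean dead = now - n.lastHeartbeat > timeoutMillis;
            if (dead) orphaned.addAll(n.services);
            return dead;
        });
        for (String s : orphaned) provision(s);
        return orphaned;
    }

    /** Lookup: the node currently hosting the service, or null if none. */
    String locate(String service) {
        for (Node n : nodes.values()) {
            if (n.services.contains(service)) return n.id;
        }
        return null;
    }

    public static void main(String[] args) {
        PlatformSketch p = new PlatformSketch(1000);
        p.announce("laptop", 1, 0);
        p.announce("server", 3, 0);
        System.out.println("indexer runs on " + p.provision("indexer"));
        p.announce("laptop", 1, 1500);   // only the laptop keeps reporting in
        p.reconfigure(1600);             // the silent server is treated as failed
        System.out.println("indexer runs on " + p.locate("indexer"));
    }
}
```

Time is passed in explicitly as a number so the sketch stays deterministic; a deployed platform would use real clocks and a network protocol for the announcements, and would need a far smarter placement policy than counting free slots.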

A non-proprietary platform with these types of capabilities must be architected to execute in data centers and across individual computers connected to the internet, as well as out to the edge. UIs access the remote services and data independent of where they may be running, much like browsers accessing web sites. Distributed data is shared across peers in a P2P model and is accessible to all services. Data is accessed through common APIs and through proprietary interfaces used by collections of collaborating services.

The payoff for using services executing in an open distributed execution platform will be for

  • small companies needing to exist on strict budgets,
  • individual developers looking to create the next killer application and
  • large corporations who run virtualized services (for free or fee paid) in their own data centers

The above characteristics enable large corporations and individuals to compete basically on the same playing field.

Note: I was hoping to coin the phrase ‘Distributed Execution Platform’, but it has been used periodically elsewhere, most commonly at the moment by the Dryad project. Maybe it will become the next overused buzzword.

This is a follow-up to one of my previous posts, “Scrum not just for developers”. In the past couple of weeks, in discussions about software development, a number of people I have spoken with have indicated they believe Agile development applies only to the tasks engineers perform. It’s not that these people are opposed to having managers, stakeholders, and users involved in the development process; they simply believe the process doesn’t apply to them. This is an absolutely incorrect assumption.

This misconception comes primarily from

  • not being familiar with Agile Methodologies, and
  • not knowing how or when communication with stakeholders and users should take place.

Most typically missing from the process is communication with stakeholders and end-users. A typical anti-pattern that arises is a development team drifting away from interaction with stakeholders and users except at pre-defined scheduled meetings which are spaced too far apart. Stakeholders and users are critical for defining the desired functional capabilities of the system being built throughout each development cycle. Capabilities are refined in short development cycles and new requirements arise which must be addressed as soon as possible.

It’s true some methodologies focus more on the programming techniques. For example, XP focuses on Pair Programming, Test Driven Design and Refactoring. But even XP is dependent on a methodology driven by communication with all team members. Among other places, this is referenced in

It can’t be stressed enough that no matter what agile methodology your team applies, the communication and involvement of all project team members is critical to project success. A simple overview of Agile Development principles can be found at Manifesto for Agile Software Development, where one of the key principles is

“Business people and developers must work together daily throughout the project.”

Even better, I would include End Users or people who represent end users for constant valuable feedback throughout the development of the project.