JPA 2.1 Schema Generation

I’ve used Hibernate as a persistence layer for years. As any of you who have used it probably know, it was the basis for the Java Persistence API, the “portable” version of the Java Object Relational Mapping (ORM) API.

Until the advent of version 2.1 of the JPA specification, one of the items that still required vendor-specific code or properties was the automated generation of database schemas. JPA 2.1 standardized that as well. I’ve seen a number of articles on the web that describe this, but none of them do much more than recite the relevant properties. So, Gentle Reader, I’ve decided to take you on a small tour of what you can do with the new standard interface.

Code for this article is available on GitHub.

Background

First, a little background. The project is built using Hibernate 5.0.1, and uses an embedded version the H2 database, since that was easy to bring in via Maven, and has no external dependencies. As example entities, I have a “user” class:

@Entity
public class User
{
	@Id
	@GeneratedValue
	@Column(name = "id")
	private Integer id;
	
	@Column(name = "version")
	private Integer version;
	
	@Column(name = "name", nullable = false)
	private String name;
	
	@OneToMany(mappedBy="user")
	private List<Comment> comments;
	
	public User()
	{
	}
	
	public Integer getId()
	{
		return id;
	}

	public void setId(Integer id)
	{
		this.id = id;
	}

	public Integer getVersion()
	{
		return version;
	}

	public void setVersion(Integer version)
	{
		this.version = version;
	}

	public String getName()
	{
		return name;
	}

	public void setName(String name)
	{
		this.name = name;
	}

	public List<Comment> getComments()
	{
		return comments;
	}

	public void setComments(List<Comment> comments)
	{
		this.comments = comments;
	}
	
	public void addComment(Comment comment)
	{
		if (comments == null)
		{
			comments = new ArrayList<>();
		}
		
		comments.add(comment);
	}
}

and a “comment” class

@Entity
public class Comment
{
	@Id
	@GeneratedValue
	@Column(name = "id")
	private Integer id;

	@Column(name = "version")
	private Integer version;

	@Column(name = "text")
	private String text;

	@ManyToOne
	@JoinColumn(name = "user_id", foreignKey = @ForeignKey(name = "fk_comment_user"))
	private User user;

	public Comment()
	{
	}

	public Integer getId()
	{
		return id;
	}

	public void setId(Integer id)
	{
		this.id = id;
	}

	public Integer getVersion()
	{
		return version;
	}

	public void setVersion(Integer version)
	{
		this.version = version;
	}

	public String getText()
	{
		return text;
	}

	public void setText(String text)
	{
		this.text = text;
	}

	public User getUser()
	{
		return user;
	}

	public void setUser(User user)
	{
		this.user = user;
	}

}

As I’m sure you can see, each user can have zero or more comments in the database, and they share a bi-directional one-to-many relationship.

My persistence.xml file looks like this:

<persistence xmlns="http://xmlns.jcp.org/xml/ns/persistence"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence
             http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd"
	version="2.1">

	<persistence-unit name="script">
		<description>In-memory persistence unit for the JPA 2.1 Schema Generation blog post</description>
		
		<provider>org.hibernate.ejb.HibernatePersistence</provider>

		<class>com.silverbaytech.blog.jpaschema.model.User</class>
		<class>com.silverbaytech.blog.jpaschema.model.Comment</class>

		<properties>
		    <property name="javax.persistence.jdbc.driver" value="org.h2.Driver" />
			<property name="javax.persistence.jdbc.url" value="jdbc:h2:mem:jpaschema" />
		</properties>

	</persistence-unit>

	<persistence-unit name="database1">
		<description>On-disk persistence unit for the JPA 2.1 Schema Generation blog post</description>
		
		<provider>org.hibernate.ejb.HibernatePersistence</provider>

		<class>com.silverbaytech.blog.jpaschema.model.User</class>
		<class>com.silverbaytech.blog.jpaschema.model.Comment</class>

		<properties>
		    <property name="javax.persistence.jdbc.driver" value="org.h2.Driver" />
			<property name="javax.persistence.jdbc.url" value="jdbc:h2:./target/databaseExample1" />
		</properties>

	</persistence-unit>

	<persistence-unit name="database2">
		<description>On-disk persistence unit for the JPA 2.1 Schema Generation blog post</description>
		
		<provider>org.hibernate.ejb.HibernatePersistence</provider>

		<class>com.silverbaytech.blog.jpaschema.model.User</class>
		<class>com.silverbaytech.blog.jpaschema.model.Comment</class>

		<properties>
		    <property name="javax.persistence.jdbc.driver" value="org.h2.Driver" />
			<property name="javax.persistence.jdbc.url" value="jdbc:h2:./target/databaseExample2" />
		</properties>

	</persistence-unit>
		
	<persistence-unit name="unit-test1">
		<description>In-memory persistence unit for the JPA 2.1 Schema Generation blog post</description>
		
		<provider>org.hibernate.ejb.HibernatePersistence</provider>

		<class>com.silverbaytech.blog.jpaschema.model.User</class>
		<class>com.silverbaytech.blog.jpaschema.model.Comment</class>

		<properties>
		    <property name="javax.persistence.jdbc.driver" value="org.h2.Driver" />
			<property name="javax.persistence.jdbc.url" value="jdbc:h2:mem:" />
			<property name="javax.persistence.schema-generation.database.action" value="create" />
		</properties>

	</persistence-unit>
		
	<persistence-unit name="unit-test2">
		<description>In-memory persistence unit for the JPA 2.1 Schema Generation blog post</description>
		
		<provider>org.hibernate.ejb.HibernatePersistence</provider>

		<class>com.silverbaytech.blog.jpaschema.model.User</class>
		<class>com.silverbaytech.blog.jpaschema.model.Comment</class>

		<properties>
		    <property name="javax.persistence.jdbc.driver" value="org.h2.Driver" />
			<property name="javax.persistence.jdbc.url" value="jdbc:h2:./target/unitTestExample2" />
			<property name="javax.persistence.schema-generation.database.action" value="drop-and-create" />
		</properties>

	</persistence-unit>
</persistence>

I have four different persistence units all referring to the same entities – I’ll explain them as I go along.

Script Generation to a File

The most common, in my experience, type of schema generation is to be able to generate the DDL scripts for the database and save them to a file. In order to generate scripts this way, you need to do the following:

Provide a value for the javax.persistence.schema-generation.scripts.action property. Typically, you will use either create or drop-and-create, depending on whether you want just the scripts to create your tables, or whether you also want the scripts to drop the tables as well.
Provide a value for the javax.persistence.schema-generation.scripts.create-target property. This identifies the file to which you would like the “create” script written.
If you specified drop-and-create, provide a value for the javax.persistence.schema-generation.scripts.drop-target property. This specifies the path to the file where the “drop” script will be written.
Call the Persistence.generateSchema API.

The ScriptExample1 class does just this:

public class ScriptExample1
{
	public static void main(String[] args)
	{
		try
		{
			Map<String, Object> properties = new HashMap<>();

			properties.put("javax.persistence.schema-generation.scripts.action", "drop-and-create");
			properties.put(	"javax.persistence.schema-generation.scripts.create-target",
							"target/create.sql");
			properties.put(	"javax.persistence.schema-generation.scripts.drop-target",
							"target/drop.sql");

			Persistence.generateSchema("script", properties);

			System.out.println("Done");
			
			System.exit(0);
		}
		catch (Exception e)
		{
			e.printStackTrace();
		}
	}
}

I’ve set up the persistence.xml file the way one normally would for using the database. Thus, the properties I mentioned above are not included there. Instead, the persistence unit is simply defined as follows:

	<persistence-unit name="script">
		<description>In-memory persistence unit for the JPA 2.1 Schema Generation blog post</description>
		
		<provider>org.hibernate.ejb.HibernatePersistence</provider>

		<class>com.silverbaytech.blog.jpaschema.model.User</class>
		<class>com.silverbaytech.blog.jpaschema.model.Comment</class>

		<properties>
			<property name="javax.persistence.jdbc.driver" value="org.h2.Driver" />
			<property name="javax.persistence.jdbc.url" value="jdbc:h2:mem:jpaschema" />
		</properties>

	</persistence-unit>

Instead, I provide those extra properties in a Map that I pass when I call Persistence.generateSchema. The “create” script is written to target/create.sql, and ends up looking like this:

create sequence hibernate_sequence start with 1 increment by 1
create table Comment (id integer not null, text varchar(255), version integer, user_id integer, primary key (id))
create table User (id integer not null, name varchar(255) not null, version integer, primary key (id))
alter table Comment add constraint fk_comment_user foreign key (user_id) references User

Nothing really surprising there. Similarly, the “drop” script looks like:

drop table Comment if exists
drop table User if exists
drop sequence if exists hibernate_sequence

Obviously, the “drops” have been generated in the opposite order of the “creates”.

In doing this, I came across what I believe to be a bug in Hibernate – when you execute the class above, if you omit the System.exit(0), the JVM never exits. I suspect that Hibernate’s code is leaving a connection to the database open, which results in the low-level database code never exiting. I’m probably going to follow up on that with them later.

Script Generation to Memory

But what if you want to generate your scripts into memory, instead of writing them out to a file? (This was actually the task that resulted in the creation of this post.) Many of the tutorials out there fail to mention this, but if you read the specification, instead of providing the script targets as strings, you can pass a Java Writer in the Map. This is what ScriptExample2 does:

public class ScriptExample2
{
	public static void main(String[] args)
	{
		try
		{
			Map<String, Object> properties = new HashMap<>();

			StringWriter create = new StringWriter();
			StringWriter drop = new StringWriter();

			properties.put("javax.persistence.schema-generation.scripts.action", "drop-and-create");
			properties.put(	"javax.persistence.schema-generation.scripts.create-target",
							create);
			properties.put(	"javax.persistence.schema-generation.scripts.drop-target",
							drop);

			Persistence.generateSchema("script", properties);
			
			System.out.println("Create script:");
			System.out.println(create.toString());
			
			System.out.println("Drop script:");
			System.out.println(drop.toString());

			System.out.println("Done");

			System.exit(0);
		}
		catch (Exception e)
		{
			e.printStackTrace();
		}
	}
}

Generating Without a Database Connection

Both of the programs above figure out the correct Hibernate dialect by connecting to, and interrogating, the database. H2 has a feature where if you use an in-memory database URL, it creates the database when you connect to it. I made use of that particular feature above.

In theory, if you don’t have a connection to the database, you can provide the persistence provider with the following parameters instead:

javax.persistence.database-product-name
javax.persistence.database-major-version
javax.persistence.database-minor-version

You can obtain these values via JDBC. Although I’ve provided a class called DatabaseInfo in the sample code that obtains these values given a JDBC URL, it appears that this doesn’t work with Hibernate – if you don’t have the appropriate URL, Hibernate wants an explicit Dialect instead. Oh well, so much for perfect portability.

Generating a Schema When Starting a Persistence Unit

The options above will generate the DDL that you can then apply to your database server. JPA also permits you to create a schema and have it applied to the database when you use the Persistence.createEntityManagerFactory API. In web-based environments, I don’t consider this terribly useful – in most of the scenarios that I’m familiar with, you generally want your schemas to be in files so that you can apply them to the database server yourself without the danger of nuking all your hard-earned data.

I can, however, see two situations in which this could be useful:

If you have a standalone application implemented in Java that uses a database as its underlying store, and you want to init the database the first time the application is run after being installed.
You’re using an in-memory database in order to perform automated testing on some of your software.

Here, instead of the javax.persistence.schema-generation.scripts.action property you use the javax.persistence.schema-generation.database.action property.

Booting a Database for a Standalone Java Application

DatabaseSample1 provides an example of this.

public class DatabaseExample1
{
	private static final String DATABASE_FILE = "./target/databaseExample1.mv.db";
	
	public static void main(String[] args)
	{
		try
		{
			Map<String, Object> properties = new HashMap<>();
			
			Path databasePath = FileSystems.getDefault().getPath(DATABASE_FILE);
			if (!Files.exists(databasePath))
			{
				System.out.println("Database does not exist");
				properties.put("javax.persistence.schema-generation.database.action", "create");
			}
			else
			{
				System.out.println("Database already exists");
			}
			
			EntityManagerFactory factory = Persistence.createEntityManagerFactory("database1", properties);
			
			EntityManager em = factory.createEntityManager();
			
			em.getTransaction().begin();
			
			User user = new User();
			user.setName("user" + System.currentTimeMillis());
			em.persist(user);
			
			Comment comment1 = new Comment();
			comment1.setText("A comment");
			comment1.setUser(user);
			user.addComment(comment1);
			
			Comment comment2 = new Comment();
			comment2.setText("Another comment");
			comment2.setUser(user);
			user.addComment(comment2);
			
			em.persist(comment1);
			em.persist(comment2);
			
			em.getTransaction().commit();
			
			em.getTransaction().begin();
			
			Query query1 = em.createQuery("select count(*) from " + User.class.getName());
			Long count1 = (Long)query1.getSingleResult();
			Query query2 = em.createQuery("select count(*) from " + Comment.class.getName());
			Long count2 = (Long)query2.getSingleResult();
			
			em.getTransaction().commit();
			em.close();

			System.out.println(count1.toString() + " users");
			System.out.println(count2.toString() + " comments");
		}
		catch (Exception e)
		{
			e.printStackTrace();
		}
		
		System.exit(0);
	}
}

When this code is run, it checks to see if the file (the filename is based on the database name specified in the database1 persistence unit) already exists. If it does not, it adds the property to cause JPA to generate the schema. (H2 will create the database file automatically, but relies on us to create the schema within it.) If the database file already exists, it omits this step. Thus, the first time you run this program it will create the database and add a user and two comments to it. Subsequent runs (unless you delete the database file) will simply append to the existing database.

Including Extra SQL

Sometimes the entire schema that you want isn’t represented by just the list of entities you created. The JPA specification allows you to “mix in” SQL you write yourself. DatabaseExample2 shows this.

public class DatabaseExample2
{
	private static final String DATABASE_FILE = "./target/databaseExample2.mv.db";
	
	public static void main(String[] args)
	{
		try
		{
			Map<String, Object> properties = new HashMap<>();
			
			Path databasePath = FileSystems.getDefault().getPath(DATABASE_FILE);
			if (!Files.exists(databasePath))
			{
				System.out.println("Database does not exist");
				properties.put("javax.persistence.schema-generation.database.action", "create");
				properties.put("javax.persistence.schema-generation.create-source", "script-then-metadata");
				properties.put("javax.persistence.schema-generation.create-script-source", "./src/main/sql/additionalCreate.sql");
			}
			else
			{
				System.out.println("Database already exists");
			}
			
			EntityManagerFactory factory = Persistence.createEntityManagerFactory("database2", properties);
			
			EntityManager em = factory.createEntityManager();
			
			em.getTransaction().begin();
			
			User user = new User();
			user.setName("user" + System.currentTimeMillis());
			em.persist(user);
			
			Comment comment1 = new Comment();
			comment1.setText("A comment");
			comment1.setUser(user);
			user.addComment(comment1);
			
			Comment comment2 = new Comment();
			comment2.setText("Another comment");
			comment2.setUser(user);
			user.addComment(comment2);
			
			em.persist(comment1);
			em.persist(comment2);
			
			em.getTransaction().commit();
			
			em.getTransaction().begin();
			
			Query query1 = em.createQuery("select count(*) from " + User.class.getName());
			Long count1 = (Long)query1.getSingleResult();
			Query query2 = em.createQuery("select count(*) from " + Comment.class.getName());
			Long count2 = (Long)query2.getSingleResult();
			Query query3 = em.createNativeQuery("select count(*) from additional");
			BigInteger count3 = (BigInteger)query3.getSingleResult();
			
			em.getTransaction().commit();
			em.close();

			System.out.println(count1.toString() + " users");
			System.out.println(count2.toString() + " comments");
			System.out.println(count3.toString() + " additional entries");
		}
		catch (Exception e)
		{
			e.printStackTrace();
		}
		
		System.exit(0);
	}
}

Here, I’ve created an additional SQL file additionalCreate.sql that I want to run before the automatically-generated Entity code is run. I get this to happen by setting javax.persistence.schema-generation.create-source to script-then-metadata. (The default is just metadata. A metadata-then-script option is also available.) Thus, now when I initially boot my database, the additionalCreate.sql file is run, which creates a table named additional. The program logic then access this table using native SQL.

Booting an In-Memory Database for Testing

Booting up a database for automated software testing is another application of automatic schema generation. The UnitTestExample1 shows an example of doing this with an in-memory H2 database.

public class UnitTestExample1
{
	private EntityManagerFactory entityManagerFactory;
	private EntityManager entityManager;
	
	@Before
	public void setup()
	{
		entityManagerFactory = Persistence.createEntityManagerFactory("unit-test1");
		entityManager = entityManagerFactory.createEntityManager();
	}
	
	@After
	public void tearDown()
	{
		if (entityManager != null)
		{
			if (entityManager.isOpen())
			{
				entityManager.close();
			}
		}
		
		if (entityManagerFactory != null)
		{
			if (entityManagerFactory.isOpen())
			{
				entityManagerFactory.close();
			}
		}
	}
	
	private long getCount(Class<?> clazz)
	{
		entityManager.getTransaction().begin();
		Long result = (Long)entityManager.createQuery("select count(*) from " + clazz.getName()).getSingleResult();
		entityManager.getTransaction().commit();
		return result;
	}
	
	@Test
	public void databaseStartsEmpty()
	{
		assertThat(getCount(User.class), equalTo(0L));
		assertThat(getCount(Comment.class), equalTo(0L));
	}
	
	@Test
	public void canInsertItems()
	{
		entityManager.getTransaction().begin();
		
		User user = new User();
		user.setName("user" + System.currentTimeMillis());
		entityManager.persist(user);
		
		Comment comment1 = new Comment();
		comment1.setText("A comment");
		comment1.setUser(user);
		user.addComment(comment1);
		
		Comment comment2 = new Comment();
		comment2.setText("Another comment");
		comment2.setUser(user);
		user.addComment(comment2);
		
		entityManager.persist(comment1);
		entityManager.persist(comment2);
		
		entityManager.getTransaction().commit();
		
		assertThat(getCount(User.class), equalTo(1L));
		assertThat(getCount(Comment.class), equalTo(2L));
	}
}

In this case, I have included the property telling JPA to create the schema right in the persistence.xml:

	<persistence-unit name="unit-test1">
		<description>In-memory persistence unit for the JPA 2.1 Schema Generation blog post</description>
		
		<provider>org.hibernate.ejb.HibernatePersistence</provider>

		<class>com.silverbaytech.blog.jpaschema.model.User</class>
		<class>com.silverbaytech.blog.jpaschema.model.Comment</class>

		<properties>
		    <property name="javax.persistence.jdbc.driver" value="org.h2.Driver" />
			<property name="javax.persistence.jdbc.url" value="jdbc:h2:mem:" />
			<property name="javax.persistence.schema-generation.database.action" value="create" />
		</properties>

	</persistence-unit>

When the unit test code calls Persistence.createEntityManagerFactory, H2 will create a new database and apply the schema to it. H2 creates a new database with every call to Persistence.createEntityManagerFactory because I didn’t name the database instance in the javax.persistence.jdbc.url property. This is typically a good thing for testing purposes, as it guarantees that every test begins with a completely empty database, regardless of what happened with the previous test. Similarly, H2’s behavior is such that then the last connection is closed (which happens when the EntityManager.close call is made) H2 destroys that database, so we don’t leave things behind.

Booting an Physical Database for Testing

But maybe you can’t (or don’t want to) use an in-memory database. Perhaps, instead, you need to use a local database for integration testing, and are worried about data from one test “leaking” to another. This is a use for another variant on in-database schema creation.

By setting javax.persistence.schema-generation.database.action to drop-and-create instead of just create, you can ask JPA to first drop all the tables in the database corresponding to your schema and then re-create the schema. UnitTestExample2 shows this. Here, I’ve used a different persistence unit that employs an on-disk H2 database, and modified the javax.persistence.schema-generation.database.action property in the persistence unit accordingly. UnitTestExample2 is exactly the same as UnitTestExample1, except that it uses this different persistence unit. Just as before, however, the call to Persistence.createEntityManagerFactory causes the individual test to be started with a nice, empty database, even if the previous test left behind data.

Of course, I could just as easily applied the javax.persistence.schema-generation.database.action property in the code that fired up the database for unit testing, leaving the persistence unit in the form I would normally use for accessing the database in production.

Note that you can combine drop-and-create with the option shown above in Including Extra SQL – all you do is provide a “drop” script in javax.persistence.schema-generation.drop-script-source in addition to a “create” script, and specify either metadata-then-script or script-then-metadata as the value of javax.persistence.schema-generation.drop-source.

Summary

So that’s a fairly complete tour of the JPA 2.1 schema generation landscape. Happy ORM’ing!

Followup

One item that isn’t addressed in these examples is “pretty printing” the generated DDL. See my follow-up post JPA 2.1 Schema Generation – Formatted Output for an example of how to do that.

The article JPA 2.1 Schema Generation originally appeared on the SilverBayTech.com Blog.