Guest Post: Lessons Learned from Bulk Parsing and Database Insertion
04 Dec 2011
This article originates from christophersaunders.ca
I'm beginning to work on an STO application from an API that was created by Philippe Casgrain. The data was stored in a pretty big JSON file, and I decided that instead of breaking it into smaller pieces I would simply insert everything into an SQLite database. Previously the only time I had worked with Android databases had simply been via insertions with ContentValues, which is a lot slower than I had thought at first.
Previously this is what I was doing
void insertData(List objects, SQLiteDatabase db) {
ContentValues cvs = new ContentValues();
for(Foo obj : objects){
cvs.put("FooStringColumn", obj.fooString());
cvs.put("FooIntegerColumn", obj.fooInteger());
long result = db.insertWithOnConflict("Foo", null, cvs, SQLiteDatabase.CONFLICT_REPLACE);
}
}
This works alright, though when you are generating thousands of entries from your JSON data it results in something that's so slow it's painful. At first I thought my problems were stemming from using the org.json
tools included in Android so I switched over to using Google GSON instead, which did help reduce a ton of unnecessary garbage collection. Though, everything was still slow.
My database knowledge isn't extremely deep, and I recall being able to compile PreparedStatements to speed up interaction with the DB. After a bit of searching on the internet I came across an alternative way of doing database insertion using this thing known as an InsertHelper. Using an InsertHelper is pretty straightforward, though it requires writing a bit more code than ContentValues, but it's way faster.
void insertData(List objects, SQLiteDatabase db){
InsertHelper helper = new InsertHelper(db, "Foo");
final int fooStrCol = helper.getColumnIndex("FooStringColumn");
final int fooIntCol = helper.getColumnIndex("FooIntegerColumn");
for(Foo obj : objects){
helper.prepareForInsert();
helper.bind(fooStrCol, obj.fooString());
helper.bind(fooIntCol, obj.fooInteger());
helper.execute();
}
}
To also help speed things up, I extracted all of my data first then inserted everything in a transaction. This helped reduce overhead and I was able to get the database initialized in about 10 seconds. Surely I can speed it up further, but for now this seems to solve my problems.
References