4 Replies Latest reply on Jan 17, 2006 10:18 AM by saviola

Why is the ID of ProcessInstance of type long?

saviola Jan 17, 2006 5:20 AM

Hello, jBPM-ers!
I probably have a strange question.
We have been using jBPM for almost a year at my company. More than ten times we have asked each other the same question: why is the ID of ProcessInstance of type long rather than int?
The maximum positive int value is 2147483647, and in my opinion that is more than enough for all the process instances anyone would need.
Anyway, whatever my reasoning, I am still looking for an answer to this question.
Thanks a lot to anyone (maybe from the jBPM team) who can give a well-grounded answer.

Regards,
Saviola

1. Re: Why is the ID of ProcessInstance of type long?

ralfoeldi Jan 17, 2006 5:25 AM (in response to saviola)

Hi Saviola,

I'm using up to 5000000 sequence values per day. int would give me 429 days until I run out.

That's a pretty good reason for using long instead of int.

What's your problem with int?
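
The 429-day figure can be checked with a few lines of Java. This is only a minimal sketch: the class and method names are made up for illustration, and it assumes the sequence starts at 0 and is consumed at a constant 5,000,000 values per day.

```java
public class IdExhaustion {
    // Whole days until a sequence starting at 0 passes maxValue,
    // assuming a constant consumption rate (hypothetical helper).
    static long daysUntilExhausted(long maxValue, long idsPerDay) {
        return maxValue / idsPerDay;
    }

    public static void main(String[] args) {
        long idsPerDay = 5_000_000L; // figure quoted in the post

        // int: 2147483647 / 5000000 = 429 whole days
        System.out.println("int:  " + daysUntilExhausted(Integer.MAX_VALUE, idsPerDay) + " days");

        // long: the same rate lasts on the order of billions of years
        System.out.println("long: " + daysUntilExhausted(Long.MAX_VALUE, idsPerDay) / 365 + " years");
    }
}
```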

Greetings

Rainer
2. Re: Why is the ID of ProcessInstance of type long?

saviola Jan 17, 2006 7:47 AM (in response to saviola)

Hi, Rainer!
I don't have a specific problem with int. It was just a question of principle: why was long chosen instead of int?
What are you using 5 000 000 sequence values per day for? That makes about 58 sequence values per second.
Are these process instances only, or all inserts into the database?
What strategy for ID generation do you use?
What is the disk size of the computer where your database server is?
One more question: is the system where you use 5 000 000 sequence values in production, or are you just testing?
I would like you to know that I am not arguing with you :) I just want to know what the environment is :)

Best regards,
Saviola
3. Re: Why is the ID of ProcessInstance of type long?

ralfoeldi Jan 17, 2006 8:31 AM (in response to saviola)

Hi Saviola,

ID generation is standard jBPM or Hibernate. If you take a look at the db tables you'll see that the sequence is used for everything: logs, variables, processDefinitions, processInstances, tokens, whatever. So it's not really that hard to get to 5000000.
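
To illustrate why one shared sequence burns through values so quickly, here is a toy model in Java. It is not jBPM's or Hibernate's actual generator: the counter, the class name, and the row counts per process instance (2 tokens, 10 variables, 40 log entries) are all assumptions chosen for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SharedSequenceDemo {
    // One counter shared by every kind of row, standing in for a single database sequence.
    private static final AtomicLong SEQUENCE = new AtomicLong(0);

    static long nextId() {
        return SEQUENCE.incrementAndGet();
    }

    // Ids drawn for one process instance plus its dependent rows (hypothetical counts).
    static long idsConsumedPerInstance(int tokens, int variables, int logEntries) {
        long before = SEQUENCE.get();
        nextId();                                        // the process instance itself
        for (int i = 0; i < tokens; i++) nextId();       // tokens
        for (int i = 0; i < variables; i++) nextId();    // variables
        for (int i = 0; i < logEntries; i++) nextId();   // log entries
        return SEQUENCE.get() - before;
    }

    public static void main(String[] args) {
        long perInstance = idsConsumedPerInstance(2, 10, 40);
        System.out.println("ids per instance: " + perInstance);
        System.out.println("instances per day at 5,000,000 ids: " + 5_000_000L / perInstance);
    }
}
```

With these made-up counts, every process instance costs dozens of sequence values, so the number of instances is far smaller than the number of ids used.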

Sequence value use does not mean that all this data is persisted indefinitely. All logs etc. are held for 7 days and then deleted. So it's not really a question of db size. I think the tablespace is set to something like 600 MB to 1 GB.

The system is 14 days short of production, so if I vanish from this screen in two weeks you know something went badly wrong and I'm in hiding.

And as a question of principle: numbers are not a scarce resource. Last time I checked we had an unlimited supply. And you only get into trouble if you assume a system won't reach its limits. The archiving / document management system I'm integrating here reuses queue IDs after entries are removed from the queue. It took me three days to figure that one out. I just didn't think anybody could be that stupid. (They are using alphanumeric ids with a huge range of values, so there was absolutely no reason to do this.)

Ok this turned into a rant... but I think you get my point.

Greetings

Rainer
4. Re: Why is the ID of ProcessInstance of type long?

saviola Jan 17, 2006 10:18 AM (in response to saviola)

Got it :)
I'm sure others have also wondered whether long is really necessary. It's good we tried to clarify it.

Just a final comment: sequence overflow in the db is not that frightening; it's uniqueness that limits us. So if you regularly delete old data, you can still reuse ids. I'm sure you would never run out of ids even if they were ints, because 4 billion records * 4 bytes each = 16 GB for the ids alone, a lot more than a 1 GB tablespace. And since jBPM uses longs, you use twice as much space for the ids (8 bytes each). Also, ints are better and faster on 32-bit architectures.

If I go for ints, I may try to somehow modify the Hibernate mappings. But let's leave this fruitless chat, anyway...
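
The storage estimate can be reproduced with a short Java sketch. The class and method names are made up for illustration, and the calculation assumes that "4 billion" means all 2^32 distinct 32-bit values, that every id is stored at the same time, and that only the id column is counted (no row or index overhead).

```java
public class IdStorage {
    static final long GIB = 1024L * 1024L * 1024L;

    // Bytes needed to hold `count` ids of `bytesPerId` bytes each (ids only).
    static long idBytes(long count, int bytesPerId) {
        return count * bytesPerId;
    }

    public static void main(String[] args) {
        long allIntValues = 1L << 32; // 4,294,967,296 distinct 32-bit values

        // 4-byte ids: 16 GiB
        System.out.println("int ids:  " + idBytes(allIntValues, Integer.BYTES) / GIB + " GiB");

        // 8-byte ids for the same number of rows: 32 GiB
        System.out.println("long ids: " + idBytes(allIntValues, Long.BYTES) / GIB + " GiB");
    }
}
```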

Good luck with deployment!

Cheers,
Saviola