Identity column re-seed: when it is necessary?





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty{ margin-bottom:0;
}






up vote
9
down vote

favorite












During one of the last lessons at university (I'm a student), lecturer asked us to develop a database (MySQL Server if it matters) and tiny client app that would consume the database as data source.



One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words). That is, when table row is deleted, it's PK must be reused in subsequent inserts. I have average knowledge in RDBMS, PKs and identity columns. From what I understand, that identity column is just a way to let DB to auto-generate PKs when inserting rows and nothing more. And identity column value shall not be related to row attributes in any way (as long as it is not natural key).



This requirement (strictly sequential identity column) was suspicious to me. I tried to ask the lecturer what is wrong if identity is not sequential (with gaps caused by deletions), but got very abstract answer like "it is convenient for users and useful for DB administrators who maintain the database". No specific examples. The argument "convenient for users" sounds silly, because it doesn't have any meaning in business domain.



Therefore I'm curious if these reasons are real? I can think only of one case when identity column reseed is required -- when identity space is exhausted. But this is more design issue when identity column type was chosen incorrectly, say simple int instead of bigint or uniqueidentifier when table contains billion rows. Suppose, an identity column is a clustered index: can gaps in identity column affect index performance? Maybe there are other real-world reasons for automatic identity column re-seed after each delete I'm not aware of?



Thanks in advance!










share|improve this question




















  • 2




    Gaps don't affect clustered indexes, these are b-trees of a form anyway. Old (still relevant) rant: Shooting yourself in the foot: renumbering auto-inc IDs
    – danblack
    2 days ago






  • 1




    Database clusters, Galera and multiple active-write db servers, also use auto_increment_(increment|offset) (ref: dev.mysql.com/doc/refman/5.7/en/…) as a mechanism to prevent conflicts which explicitly configures servers to have gaps in namespace.
    – danblack
    2 days ago












  • "One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words)." media.tenor.com/images/5442002c8081300e02cdb610e70a5460/… This is a well known as an anti-pattern.
    – jpmc26
    yesterday



















up vote
9
down vote

favorite












During one of the last lessons at university (I'm a student), lecturer asked us to develop a database (MySQL Server if it matters) and tiny client app that would consume the database as data source.



One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words). That is, when table row is deleted, it's PK must be reused in subsequent inserts. I have average knowledge in RDBMS, PKs and identity columns. From what I understand, that identity column is just a way to let DB to auto-generate PKs when inserting rows and nothing more. And identity column value shall not be related to row attributes in any way (as long as it is not natural key).



This requirement (strictly sequential identity column) was suspicious to me. I tried to ask the lecturer what is wrong if identity is not sequential (with gaps caused by deletions), but got very abstract answer like "it is convenient for users and useful for DB administrators who maintain the database". No specific examples. The argument "convenient for users" sounds silly, because it doesn't have any meaning in business domain.



Therefore I'm curious if these reasons are real? I can think only of one case when identity column reseed is required -- when identity space is exhausted. But this is more design issue when identity column type was chosen incorrectly, say simple int instead of bigint or uniqueidentifier when table contains billion rows. Suppose, an identity column is a clustered index: can gaps in identity column affect index performance? Maybe there are other real-world reasons for automatic identity column re-seed after each delete I'm not aware of?



Thanks in advance!










share|improve this question




















  • 2




    Gaps don't affect clustered indexes, these are b-trees of a form anyway. Old (still relevant) rant: Shooting yourself in the foot: renumbering auto-inc IDs
    – danblack
    2 days ago






  • 1




    Database clusters, Galera and multiple active-write db servers, also use auto_increment_(increment|offset) (ref: dev.mysql.com/doc/refman/5.7/en/…) as a mechanism to prevent conflicts which explicitly configures servers to have gaps in namespace.
    – danblack
    2 days ago












  • "One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words)." media.tenor.com/images/5442002c8081300e02cdb610e70a5460/… This is a well known as an anti-pattern.
    – jpmc26
    yesterday















up vote
9
down vote

favorite









up vote
9
down vote

favorite











During one of the last lessons at university (I'm a student), lecturer asked us to develop a database (MySQL Server if it matters) and tiny client app that would consume the database as data source.



One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words). That is, when table row is deleted, it's PK must be reused in subsequent inserts. I have average knowledge in RDBMS, PKs and identity columns. From what I understand, that identity column is just a way to let DB to auto-generate PKs when inserting rows and nothing more. And identity column value shall not be related to row attributes in any way (as long as it is not natural key).



This requirement (strictly sequential identity column) was suspicious to me. I tried to ask the lecturer what is wrong if identity is not sequential (with gaps caused by deletions), but got very abstract answer like "it is convenient for users and useful for DB administrators who maintain the database". No specific examples. The argument "convenient for users" sounds silly, because it doesn't have any meaning in business domain.



Therefore I'm curious if these reasons are real? I can think only of one case when identity column reseed is required -- when identity space is exhausted. But this is more design issue when identity column type was chosen incorrectly, say simple int instead of bigint or uniqueidentifier when table contains billion rows. Suppose, an identity column is a clustered index: can gaps in identity column affect index performance? Maybe there are other real-world reasons for automatic identity column re-seed after each delete I'm not aware of?



Thanks in advance!










share|improve this question















During one of the last lessons at university (I'm a student), lecturer asked us to develop a database (MySQL Server if it matters) and tiny client app that would consume the database as data source.



One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words). That is, when table row is deleted, it's PK must be reused in subsequent inserts. I have average knowledge in RDBMS, PKs and identity columns. From what I understand, that identity column is just a way to let DB to auto-generate PKs when inserting rows and nothing more. And identity column value shall not be related to row attributes in any way (as long as it is not natural key).



This requirement (strictly sequential identity column) was suspicious to me. I tried to ask the lecturer what is wrong if identity is not sequential (with gaps caused by deletions), but got very abstract answer like "it is convenient for users and useful for DB administrators who maintain the database". No specific examples. The argument "convenient for users" sounds silly, because it doesn't have any meaning in business domain.



Therefore I'm curious if these reasons are real? I can think only of one case when identity column reseed is required -- when identity space is exhausted. But this is more design issue when identity column type was chosen incorrectly, say simple int instead of bigint or uniqueidentifier when table contains billion rows. Suppose, an identity column is a clustered index: can gaps in identity column affect index performance? Maybe there are other real-world reasons for automatic identity column re-seed after each delete I'm not aware of?



Thanks in advance!







mysql identity auto-increment






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 2 days ago









Rick James

39.7k22256




39.7k22256










asked 2 days ago









Crypt32

1485




1485








  • 2




    Gaps don't affect clustered indexes, these are b-trees of a form anyway. Old (still relevant) rant: Shooting yourself in the foot: renumbering auto-inc IDs
    – danblack
    2 days ago






  • 1




    Database clusters, Galera and multiple active-write db servers, also use auto_increment_(increment|offset) (ref: dev.mysql.com/doc/refman/5.7/en/…) as a mechanism to prevent conflicts which explicitly configures servers to have gaps in namespace.
    – danblack
    2 days ago












  • "One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words)." media.tenor.com/images/5442002c8081300e02cdb610e70a5460/… This is a well known as an anti-pattern.
    – jpmc26
    yesterday
















  • 2




    Gaps don't affect clustered indexes, these are b-trees of a form anyway. Old (still relevant) rant: Shooting yourself in the foot: renumbering auto-inc IDs
    – danblack
    2 days ago






  • 1




    Database clusters, Galera and multiple active-write db servers, also use auto_increment_(increment|offset) (ref: dev.mysql.com/doc/refman/5.7/en/…) as a mechanism to prevent conflicts which explicitly configures servers to have gaps in namespace.
    – danblack
    2 days ago












  • "One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words)." media.tenor.com/images/5442002c8081300e02cdb610e70a5460/… This is a well known as an anti-pattern.
    – jpmc26
    yesterday










2




2




Gaps don't affect clustered indexes, these are b-trees of a form anyway. Old (still relevant) rant: Shooting yourself in the foot: renumbering auto-inc IDs
– danblack
2 days ago




Gaps don't affect clustered indexes, these are b-trees of a form anyway. Old (still relevant) rant: Shooting yourself in the foot: renumbering auto-inc IDs
– danblack
2 days ago




1




1




Database clusters, Galera and multiple active-write db servers, also use auto_increment_(increment|offset) (ref: dev.mysql.com/doc/refman/5.7/en/…) as a mechanism to prevent conflicts which explicitly configures servers to have gaps in namespace.
– danblack
2 days ago






Database clusters, Galera and multiple active-write db servers, also use auto_increment_(increment|offset) (ref: dev.mysql.com/doc/refman/5.7/en/…) as a mechanism to prevent conflicts which explicitly configures servers to have gaps in namespace.
– danblack
2 days ago














"One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words)." media.tenor.com/images/5442002c8081300e02cdb610e70a5460/… This is a well known as an anti-pattern.
– jpmc26
yesterday






"One of requirements was that the identity column (which is the PK in every table) must be sequential, because it is a good practice (as per lecturer words)." media.tenor.com/images/5442002c8081300e02cdb610e70a5460/… This is a well known as an anti-pattern.
– jpmc26
yesterday












3 Answers
3






active

oldest

votes

















up vote
13
down vote



accepted











That is, when table row is deleted, it's PK must be reused in subsequent inserts.




What universe is your lecturer from??



That is grossly inefficient. If you try to do that, you will cut your performance prospects down by a factor of 10.



If you need gapless numbers for auditing reasons, build them explicitly, not directly from database tools. And never delete rows, but flag them as "deleted". This will add to the messiness of queries, since they will have to ignore such rows.



In MySQL, InnoDB requires the existence of a unique PRIMARY KEY for each table. But that is the extent of the requirement. The key can even be a string.



Gaps are a convenience to the users and DBAs, not an inconvenience.



I can think of one case where gapless would be convenient -- chunking into groups of 100 rows at a time. But there is a simple workaround using LIMIT 100,1.



Gaps have zero impact on performance. That includes non-numeric indexes. And non-unique indexes. And composite indexes.



Sure, you can run out of ids. I think I have seen it happen twice in nearly 2 decades of using MySQL. I may as well worry about being struck by an asteroid. It's low on my things-that-keep-me-awake-at-night list.



Gaps occur from (at least): INSERT IGNORE, IODKU, REPLACE, DELETE, ROLLBACK (explicit, or due to crash), Multi-master replication (including Galera and Group Replication). Do you really want to come up with workarounds for those?!



Feel free to have us sanity-check anything else that lecturer says that is suspicious.






share|improve this answer























  • Don’t ask. I feel that her statement is not very correct, but don’t have arguments to argue this why I’m asking.
    – Crypt32
    2 days ago






  • 1




    @Crypt32 - I added a bunch more. I think this is a question that needs to be tackled the other way. The lecturer should provide coherent claims; then we can shoot them down or suggest that those claims do not justify the complexity of reusing ids. I suggest that they need to provide sample code on how to identify and reuse the ids; again so we can point out how messy that will be.
    – Rick James
    2 days ago










  • I tried to ask again during intermission, but all I got was what I said in my question.
    – Crypt32
    2 days ago






  • 1




    Consider that the lecturer's requirement may not have a grounding on real database reasons, but rather, stems from the lecturer wanting to put the students through hoops so that they go through the experience of being forced to tackle difficult algorithms.
    – João Mendes
    yesterday






  • 1




    @Vilx- - That is fine for only one of my half-dozen cases where gaps happen.
    – Rick James
    yesterday


















up vote
6
down vote













Reusing an identity value, should in general be discouraged. Either the value is used entirely internally, in which case it’s actual value is immaterial, or it is also used externally in which case reusing the value is very likely going to lead to a misidentification.



Take the obvious case of an invoice or purchase order number, these might easily come from an identity column and be exposed externally, but you would never want to reuse them for precisely that reason. Both refer to specific transactions which you would not want to get confused.



Resolving such issues can be a big hassle when companies merge or are acquired. Creating such problems on purpose? Not wise.






share|improve this answer





















  • @Crypt32 - The "invoice number" argument may be the most compelling one to throw at your lecturer.
    – Rick James
    yesterday










  • Good example, I didn't think in this way. Bookmarking a link to a resource with its exposed ID would point to a totally different resource after reuse.
    – Crypt32
    yesterday


















up vote
4
down vote













The reuse of PK id values has problems and generally should be avoided.



First, the implementation of auto_increment columns doesn't provide the guarantee of being gapless. Indeed gaps will occur if you rollback an insert on an auto increment column.



Secondly the gap ID may refer to existing data that hasn't been deleted (due to missing FK constrains). If they translate to member numbers communicated outside the system then that poses potential business identity risks.



Thirdly, bigint unsigned won't run out of IDs for a significant time even given an extremely large insert rate.



The biggest pain with gaps is coming across auditors who insist its an audit flaw. For DBAs they know gaps exist and why.






share|improve this answer





















  • Interesting sentence about auditors. How they explain gaps as flaw? “Rollback an insert”, you mean when insert fails due to some reasons?
    – Crypt32
    2 days ago






  • 1




    Rollback an insert is either failure or user instigated like: BEGIN; INSERT INTO ... ; {some other insert/update}; ROLLBACK
    – danblack
    2 days ago






  • 1




    Auditors just use similar terms around IDs to DBAs, just fairly different business meanings. Each is logical to their own view on the world. Usually resolvable with good communication.
    – danblack
    2 days ago











Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223392%2fidentity-column-re-seed-when-it-is-necessary%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























3 Answers
3






active

oldest

votes








3 Answers
3






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
13
down vote



accepted











That is, when table row is deleted, it's PK must be reused in subsequent inserts.




What universe is your lecturer from??



That is grossly inefficient. If you try to do that, you will cut your performance prospects down by a factor of 10.



If you need gapless numbers for auditing reasons, build them explicitly, not directly from database tools. And never delete rows, but flag them as "deleted". This will add to the messiness of queries, since they will have to ignore such rows.



In MySQL, InnoDB requires the existence of a unique PRIMARY KEY for each table. But that is the extent of the requirement. The key can even be a string.



Gaps are a convenience to the users and DBAs, not an inconvenience.



I can think of one case where gapless would be convenient -- chunking into groups of 100 rows at a time. But there is a simple workaround using LIMIT 100,1.



Gaps have zero impact on performance. That includes non-numeric indexes. And non-unique indexes. And composite indexes.



Sure, you can run out of ids. I think I have seen it happen twice in nearly 2 decades of using MySQL. I may as well worry about being struck by an asteroid. It's low on my things-that-keep-me-awake-at-night list.



Gaps occur from (at least): INSERT IGNORE, IODKU, REPLACE, DELETE, ROLLBACK (explicit, or due to crash), Multi-master replication (including Galera and Group Replication). Do you really want to come up with workarounds for those?!



Feel free to have us sanity-check anything else that lecturer says that is suspicious.






share|improve this answer























  • Don’t ask. I feel that her statement is not very correct, but don’t have arguments to argue this why I’m asking.
    – Crypt32
    2 days ago






  • 1




    @Crypt32 - I added a bunch more. I think this is a question that needs to be tackled the other way. The lecturer should provide coherent claims; then we can shoot them down or suggest that those claims do not justify the complexity of reusing ids. I suggest that they need to provide sample code on how to identify and reuse the ids; again so we can point out how messy that will be.
    – Rick James
    2 days ago










  • I tried to ask again during intermission, but all I got was what I said in my question.
    – Crypt32
    2 days ago






  • 1




    Consider that the lecturer's requirement may not have a grounding on real database reasons, but rather, stems from the lecturer wanting to put the students through hoops so that they go through the experience of being forced to tackle difficult algorithms.
    – João Mendes
    yesterday






  • 1




    @Vilx- - That is fine for only one of my half-dozen cases where gaps happen.
    – Rick James
    yesterday















up vote
13
down vote



accepted











That is, when table row is deleted, it's PK must be reused in subsequent inserts.




What universe is your lecturer from??



That is grossly inefficient. If you try to do that, you will cut your performance prospects down by a factor of 10.



If you need gapless numbers for auditing reasons, build them explicitly, not directly from database tools. And never delete rows, but flag them as "deleted". This will add to the messiness of queries, since they will have to ignore such rows.



In MySQL, InnoDB requires the existence of a unique PRIMARY KEY for each table. But that is the extent of the requirement. The key can even be a string.



Gaps are a convenience to the users and DBAs, not an inconvenience.



I can think of one case where gapless would be convenient -- chunking into groups of 100 rows at a time. But there is a simple workaround using LIMIT 100,1.



Gaps have zero impact on performance. That includes non-numeric indexes. And non-unique indexes. And composite indexes.



Sure, you can run out of ids. I think I have seen it happen twice in nearly 2 decades of using MySQL. I may as well worry about being struck by an asteroid. It's low on my things-that-keep-me-awake-at-night list.



Gaps occur from (at least): INSERT IGNORE, IODKU, REPLACE, DELETE, ROLLBACK (explicit, or due to crash), Multi-master replication (including Galera and Group Replication). Do you really want to come up with workarounds for those?!



Feel free to have us sanity-check anything else that lecturer says that is suspicious.






share|improve this answer























  • Don’t ask. I feel that her statement is not very correct, but don’t have arguments to argue this why I’m asking.
    – Crypt32
    2 days ago






  • 1




    @Crypt32 - I added a bunch more. I think this is a question that needs to be tackled the other way. The lecturer should provide coherent claims; then we can shoot them down or suggest that those claims do not justify the complexity of reusing ids. I suggest that they need to provide sample code on how to identify and reuse the ids; again so we can point out how messy that will be.
    – Rick James
    2 days ago










  • I tried to ask again during intermission, but all I got was what I said in my question.
    – Crypt32
    2 days ago






  • 1




    Consider that the lecturer's requirement may not have a grounding on real database reasons, but rather, stems from the lecturer wanting to put the students through hoops so that they go through the experience of being forced to tackle difficult algorithms.
    – João Mendes
    yesterday






  • 1




    @Vilx- - That is fine for only one of my half-dozen cases where gaps happen.
    – Rick James
    yesterday













up vote
13
down vote



accepted







up vote
13
down vote



accepted







That is, when table row is deleted, it's PK must be reused in subsequent inserts.




What universe is your lecturer from??



That is grossly inefficient. If you try to do that, you will cut your performance prospects down by a factor of 10.



If you need gapless numbers for auditing reasons, build them explicitly, not directly from database tools. And never delete rows, but flag them as "deleted". This will add to the messiness of queries, since they will have to ignore such rows.



In MySQL, InnoDB requires the existence of a unique PRIMARY KEY for each table. But that is the extent of the requirement. The key can even be a string.



Gaps are a convenience to the users and DBAs, not an inconvenience.



I can think of one case where gapless would be convenient -- chunking into groups of 100 rows at a time. But there is a simple workaround using LIMIT 100,1.



Gaps have zero impact on performance. That includes non-numeric indexes. And non-unique indexes. And composite indexes.



Sure, you can run out of ids. I think I have seen it happen twice in nearly 2 decades of using MySQL. I may as well worry about being struck by an asteroid. It's low on my things-that-keep-me-awake-at-night list.



Gaps occur from (at least): INSERT IGNORE, IODKU, REPLACE, DELETE, ROLLBACK (explicit, or due to crash), Multi-master replication (including Galera and Group Replication). Do you really want to come up with workarounds for those?!



Feel free to have us sanity-check anything else that lecturer says that is suspicious.






share|improve this answer















That is, when table row is deleted, it's PK must be reused in subsequent inserts.




What universe is your lecturer from??



That is grossly inefficient. If you try to do that, you will cut your performance prospects down by a factor of 10.



If you need gapless numbers for auditing reasons, build them explicitly, not directly from database tools. And never delete rows, but flag them as "deleted". This will add to the messiness of queries, since they will have to ignore such rows.



In MySQL, InnoDB requires the existence of a unique PRIMARY KEY for each table. But that is the extent of the requirement. The key can even be a string.



Gaps are a convenience to the users and DBAs, not an inconvenience.



I can think of one case where gapless would be convenient -- chunking into groups of 100 rows at a time. But there is a simple workaround using LIMIT 100,1.



Gaps have zero impact on performance. That includes non-numeric indexes. And non-unique indexes. And composite indexes.



Sure, you can run out of ids. I think I have seen it happen twice in nearly 2 decades of using MySQL. I may as well worry about being struck by an asteroid. It's low on my things-that-keep-me-awake-at-night list.



Gaps occur from (at least): INSERT IGNORE, IODKU, REPLACE, DELETE, ROLLBACK (explicit, or due to crash), Multi-master replication (including Galera and Group Replication). Do you really want to come up with workarounds for those?!



Feel free to have us sanity-check anything else that lecturer says that is suspicious.







share|improve this answer














share|improve this answer



share|improve this answer








edited yesterday

























answered 2 days ago









Rick James

39.7k22256




39.7k22256












  • Don’t ask. I feel that her statement is not very correct, but don’t have arguments to argue this why I’m asking.
    – Crypt32
    2 days ago






  • 1




    @Crypt32 - I added a bunch more. I think this is a question that needs to be tackled the other way. The lecturer should provide coherent claims; then we can shoot them down or suggest that those claims do not justify the complexity of reusing ids. I suggest that they need to provide sample code on how to identify and reuse the ids; again so we can point out how messy that will be.
    – Rick James
    2 days ago










  • I tried to ask again during intermission, but all I got was what I said in my question.
    – Crypt32
    2 days ago






  • 1




    Consider that the lecturer's requirement may not have a grounding on real database reasons, but rather, stems from the lecturer wanting to put the students through hoops so that they go through the experience of being forced to tackle difficult algorithms.
    – João Mendes
    yesterday






  • 1




    @Vilx- - That is fine for only one of my half-dozen cases where gaps happen.
    – Rick James
    yesterday


















  • Don’t ask. I feel that her statement is not very correct, but don’t have arguments to argue this why I’m asking.
    – Crypt32
    2 days ago






  • 1




    @Crypt32 - I added a bunch more. I think this is a question that needs to be tackled the other way. The lecturer should provide coherent claims; then we can shoot them down or suggest that those claims do not justify the complexity of reusing ids. I suggest that they need to provide sample code on how to identify and reuse the ids; again so we can point out how messy that will be.
    – Rick James
    2 days ago










  • I tried to ask again during intermission, but all I got was what I said in my question.
    – Crypt32
    2 days ago






  • 1




    Consider that the lecturer's requirement may not have a grounding on real database reasons, but rather, stems from the lecturer wanting to put the students through hoops so that they go through the experience of being forced to tackle difficult algorithms.
    – João Mendes
    yesterday






  • 1




    @Vilx- - That is fine for only one of my half-dozen cases where gaps happen.
    – Rick James
    yesterday
















Don’t ask. I feel that her statement is not very correct, but don’t have arguments to argue this why I’m asking.
– Crypt32
2 days ago




Don’t ask. I feel that her statement is not very correct, but don’t have arguments to argue this why I’m asking.
– Crypt32
2 days ago




1




1




@Crypt32 - I added a bunch more. I think this is a question that needs to be tackled the other way. The lecturer should provide coherent claims; then we can shoot them down or suggest that those claims do not justify the complexity of reusing ids. I suggest that they need to provide sample code on how to identify and reuse the ids; again so we can point out how messy that will be.
– Rick James
2 days ago




@Crypt32 - I added a bunch more. I think this is a question that needs to be tackled the other way. The lecturer should provide coherent claims; then we can shoot them down or suggest that those claims do not justify the complexity of reusing ids. I suggest that they need to provide sample code on how to identify and reuse the ids; again so we can point out how messy that will be.
– Rick James
2 days ago












I tried to ask again during intermission, but all I got was what I said in my question.
– Crypt32
2 days ago




I tried to ask again during intermission, but all I got was what I said in my question.
– Crypt32
2 days ago




1




1




Consider that the lecturer's requirement may not have a grounding on real database reasons, but rather, stems from the lecturer wanting to put the students through hoops so that they go through the experience of being forced to tackle difficult algorithms.
– João Mendes
yesterday




Consider that the lecturer's requirement may not have a grounding on real database reasons, but rather, stems from the lecturer wanting to put the students through hoops so that they go through the experience of being forced to tackle difficult algorithms.
– João Mendes
yesterday




1




1




@Vilx- - That is fine for only one of my half-dozen cases where gaps happen.
– Rick James
yesterday




@Vilx- - That is fine for only one of my half-dozen cases where gaps happen.
– Rick James
yesterday












up vote
6
down vote













Reusing an identity value, should in general be discouraged. Either the value is used entirely internally, in which case it’s actual value is immaterial, or it is also used externally in which case reusing the value is very likely going to lead to a misidentification.



Take the obvious case of an invoice or purchase order number, these might easily come from an identity column and be exposed externally, but you would never want to reuse them for precisely that reason. Both refer to specific transactions which you would not want to get confused.



Resolving such issues can be a big hassle when companies merge or are acquired. Creating such problems on purpose? Not wise.






share|improve this answer





















  • @Crypt32 - The "invoice number" argument may be the most compelling one to throw at your lecturer.
    – Rick James
    yesterday










  • Good example, I didn't think in this way. Bookmarking a link to a resource with its exposed ID would point to a totally different resource after reuse.
    – Crypt32
    yesterday















up vote
6
down vote













Reusing an identity value, should in general be discouraged. Either the value is used entirely internally, in which case it’s actual value is immaterial, or it is also used externally in which case reusing the value is very likely going to lead to a misidentification.



Take the obvious case of an invoice or purchase order number, these might easily come from an identity column and be exposed externally, but you would never want to reuse them for precisely that reason. Both refer to specific transactions which you would not want to get confused.



Resolving such issues can be a big hassle when companies merge or are acquired. Creating such problems on purpose? Not wise.






share|improve this answer





















  • @Crypt32 - The "invoice number" argument may be the most compelling one to throw at your lecturer.
    – Rick James
    yesterday










  • Good example, I didn't think in this way. Bookmarking a link to a resource with its exposed ID would point to a totally different resource after reuse.
    – Crypt32
    yesterday













up vote
6
down vote










up vote
6
down vote









Reusing an identity value, should in general be discouraged. Either the value is used entirely internally, in which case it’s actual value is immaterial, or it is also used externally in which case reusing the value is very likely going to lead to a misidentification.



Take the obvious case of an invoice or purchase order number, these might easily come from an identity column and be exposed externally, but you would never want to reuse them for precisely that reason. Both refer to specific transactions which you would not want to get confused.



Resolving such issues can be a big hassle when companies merge or are acquired. Creating such problems on purpose? Not wise.






share|improve this answer












Reusing an identity value, should in general be discouraged. Either the value is used entirely internally, in which case it’s actual value is immaterial, or it is also used externally in which case reusing the value is very likely going to lead to a misidentification.



Take the obvious case of an invoice or purchase order number, these might easily come from an identity column and be exposed externally, but you would never want to reuse them for precisely that reason. Both refer to specific transactions which you would not want to get confused.



Resolving such issues can be a big hassle when companies merge or are acquired. Creating such problems on purpose? Not wise.







share|improve this answer












share|improve this answer



share|improve this answer










answered 2 days ago









jmoreno

539416




539416












  • @Crypt32 - The "invoice number" argument may be the most compelling one to throw at your lecturer.
    – Rick James
    yesterday










  • Good example, I didn't think in this way. Bookmarking a link to a resource with its exposed ID would point to a totally different resource after reuse.
    – Crypt32
    yesterday


















  • @Crypt32 - The "invoice number" argument may be the most compelling one to throw at your lecturer.
    – Rick James
    yesterday










  • Good example, I didn't think in this way. Bookmarking a link to a resource with its exposed ID would point to a totally different resource after reuse.
    – Crypt32
    yesterday
















@Crypt32 - The "invoice number" argument may be the most compelling one to throw at your lecturer.
– Rick James
yesterday




@Crypt32 - The "invoice number" argument may be the most compelling one to throw at your lecturer.
– Rick James
yesterday












Good example, I didn't think in this way. Bookmarking a link to a resource with its exposed ID would point to a totally different resource after reuse.
– Crypt32
yesterday




Good example, I didn't think in this way. Bookmarking a link to a resource with its exposed ID would point to a totally different resource after reuse.
– Crypt32
yesterday










up vote
4
down vote













The reuse of PK id values has problems and generally should be avoided.



First, the implementation of auto_increment columns doesn't provide the guarantee of being gapless. Indeed gaps will occur if you rollback an insert on an auto increment column.



Secondly the gap ID may refer to existing data that hasn't been deleted (due to missing FK constrains). If they translate to member numbers communicated outside the system then that poses potential business identity risks.



Thirdly, bigint unsigned won't run out of IDs for a significant time even given an extremely large insert rate.



The biggest pain with gaps is coming across auditors who insist its an audit flaw. For DBAs they know gaps exist and why.






share|improve this answer





















  • Interesting sentence about auditors. How they explain gaps as flaw? “Rollback an insert”, you mean when insert fails due to some reasons?
    – Crypt32
    2 days ago






  • 1




    Rollback an insert is either failure or user instigated like: BEGIN; INSERT INTO ... ; {some other insert/update}; ROLLBACK
    – danblack
    2 days ago






  • 1




    Auditors just use similar terms around IDs to DBAs, just fairly different business meanings. Each is logical to their own view on the world. Usually resolvable with good communication.
    – danblack
    2 days ago















up vote
4
down vote













The reuse of PK id values has problems and generally should be avoided.



First, the implementation of auto_increment columns doesn't provide the guarantee of being gapless. Indeed gaps will occur if you rollback an insert on an auto increment column.



Secondly the gap ID may refer to existing data that hasn't been deleted (due to missing FK constrains). If they translate to member numbers communicated outside the system then that poses potential business identity risks.



Thirdly, bigint unsigned won't run out of IDs for a significant time even given an extremely large insert rate.



The biggest pain with gaps is coming across auditors who insist its an audit flaw. For DBAs they know gaps exist and why.






share|improve this answer





















  • Interesting sentence about auditors. How they explain gaps as flaw? “Rollback an insert”, you mean when insert fails due to some reasons?
    – Crypt32
    2 days ago






  • 1




    Rollback an insert is either failure or user instigated like: BEGIN; INSERT INTO ... ; {some other insert/update}; ROLLBACK
    – danblack
    2 days ago






  • 1




    Auditors just use similar terms around IDs to DBAs, just fairly different business meanings. Each is logical to their own view on the world. Usually resolvable with good communication.
    – danblack
    2 days ago













up vote
4
down vote










up vote
4
down vote









The reuse of PK id values has problems and generally should be avoided.



First, the implementation of auto_increment columns doesn't provide the guarantee of being gapless. Indeed gaps will occur if you rollback an insert on an auto increment column.



Secondly the gap ID may refer to existing data that hasn't been deleted (due to missing FK constrains). If they translate to member numbers communicated outside the system then that poses potential business identity risks.



Thirdly, bigint unsigned won't run out of IDs for a significant time even given an extremely large insert rate.



The biggest pain with gaps is coming across auditors who insist its an audit flaw. For DBAs they know gaps exist and why.






share|improve this answer












The reuse of PK id values has problems and generally should be avoided.



First, the implementation of auto_increment columns doesn't provide the guarantee of being gapless. Indeed gaps will occur if you rollback an insert on an auto increment column.



Secondly the gap ID may refer to existing data that hasn't been deleted (due to missing FK constrains). If they translate to member numbers communicated outside the system then that poses potential business identity risks.



Thirdly, bigint unsigned won't run out of IDs for a significant time even given an extremely large insert rate.



The biggest pain with gaps is coming across auditors who insist its an audit flaw. For DBAs they know gaps exist and why.







share|improve this answer












share|improve this answer



share|improve this answer










answered 2 days ago









danblack

1,3961212




1,3961212












  • Interesting sentence about auditors. How they explain gaps as flaw? “Rollback an insert”, you mean when insert fails due to some reasons?
    – Crypt32
    2 days ago






  • 1




    Rollback an insert is either failure or user instigated like: BEGIN; INSERT INTO ... ; {some other insert/update}; ROLLBACK
    – danblack
    2 days ago






  • 1




    Auditors just use similar terms around IDs to DBAs, just fairly different business meanings. Each is logical to their own view on the world. Usually resolvable with good communication.
    – danblack
    2 days ago


















  • Interesting sentence about auditors. How they explain gaps as flaw? “Rollback an insert”, you mean when insert fails due to some reasons?
    – Crypt32
    2 days ago






  • 1




    Rollback an insert is either failure or user instigated like: BEGIN; INSERT INTO ... ; {some other insert/update}; ROLLBACK
    – danblack
    2 days ago






  • 1




    Auditors just use similar terms around IDs to DBAs, just fairly different business meanings. Each is logical to their own view on the world. Usually resolvable with good communication.
    – danblack
    2 days ago
















Interesting sentence about auditors. How they explain gaps as flaw? “Rollback an insert”, you mean when insert fails due to some reasons?
– Crypt32
2 days ago




Interesting sentence about auditors. How they explain gaps as flaw? “Rollback an insert”, you mean when insert fails due to some reasons?
– Crypt32
2 days ago




1




1




Rollback an insert is either failure or user instigated like: BEGIN; INSERT INTO ... ; {some other insert/update}; ROLLBACK
– danblack
2 days ago




Rollback an insert is either failure or user instigated like: BEGIN; INSERT INTO ... ; {some other insert/update}; ROLLBACK
– danblack
2 days ago




1




1




Auditors just use similar terms around IDs to DBAs, just fairly different business meanings. Each is logical to their own view on the world. Usually resolvable with good communication.
– danblack
2 days ago




Auditors just use similar terms around IDs to DBAs, just fairly different business meanings. Each is logical to their own view on the world. Usually resolvable with good communication.
– danblack
2 days ago


















 

draft saved


draft discarded



















































 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223392%2fidentity-column-re-seed-when-it-is-necessary%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Mont Emei

Province de Neuquén

Journaliste