Root Zone Scaling
ICANN Meeting, Sydney, Australia
Monday, 22 June 2009

>>LYMAN CHAPIN: Thank you. I apologize for the late start. We're going to be talking today about the root scaling study that has been commissioned by the three groups that you see up here -- SSAC, RSSAC, and ICANN staff -- to look at issues concerning the effects on the operation of the root server system of adding new gTLDs, but also of doing that at the same time that we're doing a number of other things to the root. What we're going to do is, I'll give a brief summary of the study: what it's about, where we stand, what the status is. And we're hoping to leave at least the last 30 minutes for Q&A. We may have a little bit more time than that. But give people an opportunity to ask questions. We're also going to go into some detail on the operation of the root server system as it exists today and suggest some areas in which our preliminary investigations -- we've mostly been doing information gathering from the various root server operators so far -- have turned up interesting issues to be engaged. I should point out that, because the study is limited to a very short period of time -- we have to have all of our work done by August 31st -- one of the results of the study, in addition to the actual output, the work that we will have completed, will be some indication of additional work that needs to be done. Perhaps other studies that need to be conducted. So, if we think about what's happening at the moment compared to the circumstances that have obtained since -- certainly since the beginning of ICANN, probably for more than 10 years -- over that period of time the root has remained relatively stable. We've done about one new TLD a year. The rate of change has been very low; the root has seen almost no activity at all. And we're about to change that. We're going to expand the size of the root by adding new TLDs and IDN TLDs, and new resource records for DNSSEC and so forth. And we're going to increase the complexity of the operation of the root server system. So we're contemplating doing some things at the root level of the DNS that go beyond anything that we've been doing so far. And that sort of makes it fairly obvious that what we ought to be doing is taking a pretty close look at what the effects of doing that are likely to be. And that's precisely what the ICANN board resolution from back in February that started this study was all about. This slide is here primarily for reference. It contains all of the URLs so that, if you're looking at the deck afterwards, you'll have the reference. But it's the same as the previous one. The study team started its work early in May. I put together a team of six people including myself and Bill Manning, who is sitting next to me. Jaap Akkerhuis, Glenn Kowack, Patrik Faltstrom, and Lars-Johan Liman are the other members of the team. The announcement of the public comments page and the opening of the public comment period are listed up here as well. And we have to complete all of our work by the end of August. That's partly so that we can feed the results of the work that we're doing now into the third and final draft version of the draft applicant guidebook, as you heard this morning in the new gTLD introduction session, and also so that we can have adequate opportunity for public review and comment prior to the ICANN annual meeting in Seoul at the end of October. 
The issue that we're dealing with is very simply: what will the effect on the root server system be of expanding the size of the root zone and adding support for things like v6, IDNs, and DNSSEC? What we're hoping to do is not so much look at this from a scenario standpoint. What we've been working with in our discussions up until now are primarily anecdotal accounts of what might happen if you did this, or what the consequences of changing the value of this variable might be. And we're hoping to move the basis of the discussion about the effects on the root server system away from a series of anecdotes or disconnected narratives and onto a data-analysis-driven model that will enable us to demonstrate as clearly as possible what the effect will be of changing this value. So, if you think of the root server system as a set of boxes that have knobs on them: as you turn the knobs and change the value of variables in one part of the system, how are those effects going to be felt in other parts of the system? So rather than come up with conclusions that say things like, well, you know, a thousand new TLDs is fine but 10,000 is too many -- which is essentially a policy decision, because it all has to do with how you deal with or mitigate the consequences of various actions -- we're going to produce a model that instead will allow you to look, from a definitive and deterministic rather than a subjective and anecdotal perspective, at what the effects of doing various things are going to be. So we'll have a factual and analytical tool to use rather than one that's primarily anecdotal. We're going to spend a certain amount of time on this slide talking about the way in which the root server system operates, and the way in which it behaves as a system. We're thinking of it in terms of, you know, the same way you would think about a biological ecosystem. What are the sorts of things that can happen in the environment in which the root servers have to operate that might affect how they operate? And how can you characterize those, how can you quantify them, in ways that let you draw meaningful conclusions from the effects that you can observe when you change the values of different characteristics of their environment? Before I do that, I'm going to ask -- I have sitting up here one of the members of the steering group. If you remember from the first slide, there were three groups that were involved in chartering this study. Each of those three groups contributed four people to a steering group that is overseeing the activities of the study that we're undertaking right now. And one of the representatives of that steering group from the Root Server System Advisory Committee, Suzanne Woolf, is sitting on my left. Bill Manning is sitting on my right. He's a member of the study team. We're going to, as I say, go over this slide in some detail. I want to pause at this point, because we're about to make a transition from all of that nice introductory material that I just went through and start talking in more technical detail about the way the root operates, to ask if there are any questions about the study process and the way in which we have, you know, gotten to the point where we're able to put a slide up like this and start talking to it. So I'd be happy to take any questions at this point, maybe just a few minutes, before we launch into this. If there aren't any, that's fine too. Do we have a microphone? John, would you be our mic boy? Please give your name so the scribes can -- >>MARKUS TRAVAILLE: Yes. 
My name is Markus Travaille from SIDN, the Dutch registry. I was asking why there is nobody from the registry operators or the registries in this study. It's just a question of how the team was formed. Maybe you can answer as the chair. >>LYMAN CHAPIN: Sure. The team was assembled using a number of criteria. One of them, though, was that the team had to be fairly small. So we tended to emphasize -- we have two of the root server operators on the team. We weren't able to do a comprehensive canvass of the space. What we decided to do, rather than try to put lots of people onto the study team, was to make sure that we very carefully identified all the different groups that we wanted to talk to. So we distinguished between "we need a certain number of people on the team to do the work" and "we need to be sure that we have access to information from all the important sources." Any other process questions for the study itself? No. >>SUZANNE WOOLF: If I could just add a point to that. The terms of reference document, which I believe you showed the URL for -- the principal output of the steering group, which was made up of the committees that were originally chartered with the work by the board, and some members of staff -- does attempt to set out a scope that will allow the team to pull in everyone involved with the first-order effects, while, as Lyman said, trying very hard to keep the effort very focused and specific. It wouldn't surprise me at all if there were additional questions raised that are not in scope for this study. One of the work items for the study is to document where the scope stops and what other additional interesting questions come up. So there will be material there expanding what's being looked at. >>LYMAN CHAPIN: I'll just skip forward to this slide temporarily. We do have a public comment area. We are encouraging people to send comments to this email address. But we're also encouraging -- in part through having this kind of a session at this ICANN meeting -- people who are in a position to have perspectives on the operation of the root that they'd like to share with us. I'm available all this week. I'm available, you know, by email and so forth after this meeting. I very much encourage anybody who wants to offer a perspective or provide a viewpoint to get in touch with us. Because we're doing as much outreach as we can, trying to cover as many of the different information sources as we can. But, rather than run the risk of overlooking anything, we'd very much encourage outreach in the other direction as well. So, if we look at the root server system we have today: the way this slide is constructed, it's very basic. It doesn't show a lot of the details in the boxes. But it identifies a number of the functions that have to be performed and a set of six actors that are involved in the performance of those functions. And just to run through very quickly -- and we can go through it in more detail -- the system, at the highest level, can be divided into a provisioning side, which is: how do you get the information about the top-level domains, the entries in the root zone -- how do you get those data to the places where they can be queried? And then the query side, or the publication side, where the system makes that information available to queries in real time, so that, when people look up a TLD, they get a response back from one of the anycast instances of a root server. So that's the fundamental division. And that's that wonderfully artistic curvy line down the middle that divides it into those two basic parts. 
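[Editor's illustration: the query side that Lyman describes can be exercised directly. Below is a minimal sketch using the dnspython library -- the library choice is an assumption; any DNS client would do. 198.41.0.4 is a.root-servers.net. Asking a root server for a TLD's name servers returns the delegation data published from the root zone as a referral.]

    import dns.message
    import dns.query

    # ask a root server (a.root-servers.net) where the org. TLD lives
    query = dns.message.make_query("org.", "NS")
    response = dns.query.udp(query, "198.41.0.4", timeout=5)

    # the root is not authoritative for org., so the answer comes back
    # as a referral: the org. NS records appear in the authority section
    for rrset in response.authority:
        print(rrset)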
It's a very simple system at the highest level. It really has just a couple of inputs. The change requests from TLD operators -- from authorized TLD operators -- are the principal source of input to the system, the principal original source of data. The system as a whole is governed by a set of policies that are established by various organizations that have something to say about how the root servers operate. And then, on the far right, resolvers submit queries and then cache the results. So it's a very simple system viewed from the outside. The innards are a little bit more complicated. And actually, what I'd like to do is hand this over to Bill to go through the details. I think the way I'd like to do this, rather than describe it at a high level and drop down into detail, is pretty much go through the whole thing and let you know what we understand about the system today. Do you want to look at my screen while you're -- >>BILL MANNING: I could. But I'm not sure -- I'm color blind. It was much nicer without the colors. >>LYMAN CHAPIN: No, it's very pretty. It really is. Trust me. >>BILL MANNING: I have implicit faith. So this particular diagram describes -- when we're building the model, this is the baseline we're going to build from, because this is what happens today. I'm going to start from what I think of as the right side, from the resolvers, and work my way towards that beautiful curvy line. There are 12 root server operators. Those are listed as A through, I guess, M. And some of them do anycast. Those are anycast nodes. The anycast nodes are distributed broadly throughout the global Internet topology, and they are managed, operated, and maintained by one of the 12 root server operators. The flow for root server operations is actually very simple. That small box in the middle that's labeled dm is a distribution master. The root server operators do not have editorial control over the data that we publish. We take the data as authorized and approved by the IANA, going through that cycle, and made available on the distribution master. Using normal DNS protocols, under normal circumstances, a NOTIFY is sent to a list of identified servers in each root server operator's operations, which pick up the update twice a day. And then those updates are propagated, using DNS protocols, out to the various instances. In today's environment this is actually relatively simple and easy, and it accomplishes the task, even over low-speed links, because the zone is small. Areas of concern that we've talked about with the root server operators in this particular activity are -- well, we're comfortable with the ranges in the terms of reference, which call for scaling in both size and rate of change by several orders of magnitude. Root server operators say we can handle those levels of change, maybe with a few small changes in the way we do business. The most interesting piece is what happens to the anycast instances that are far away. A discussion I had with someone earlier today raised the question: "If the root zone is a two-gig file" -- right now it's 300-plus entries, a couple of kilobytes in size -- "if it goes to a couple of gig, and the lowest-speed link that I have is a 1200-baud dial-up line, I'm not going to be able to download the root zone to my anycast instance in Los Angeles. I'm going to need to get more bandwidth." 
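[Editor's illustration: the pull Bill describes is an ordinary DNS zone transfer (AXFR) triggered by a NOTIFY. A minimal sketch with dnspython follows; the host name is illustrative and assumes a server that permits public AXFR of the root zone, which many servers do not.]

    import dns.query
    import dns.zone

    # fetch the entire root zone the way a secondary would after a NOTIFY;
    # "." names the root zone, and the chosen host must allow AXFR
    transfer = dns.query.xfr("lax.xfr.dns.icann.org", ".")
    root = dns.zone.from_xfr(transfer)

    # with only 300-odd delegations, the full transfer is tiny today
    print(len(root.nodes), "names in the root zone")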
So there are some concerns about synchronization as the root zone scales up. As far as the query rates go, most of the root server operators have been targets of denial-of-service attacks over the past 10 years and have provisioned accordingly. And we believe, as a root server operator community, that we can absorb most of the query load change that we see coming without a tremendous amount of reengineering or change. So our most significant set of questions and concerns has to do with synchronization among all of the instances that are out there as the root zone scales up in size or rate of change. The distribution master itself -- that cluster is maintained by VeriSign. And that's our interface to the publication or provisioning side of the house. And are there any questions about the root server operator side that you want to raise at this point? >>LYMAN CHAPIN: Publication side. >>BILL MANNING: Publication side, mm-hmm. Are people awake after lunch? [Laughter] >>LYMAN CHAPIN: They're all experts on the way this works. They are now, thanks to you. John. >>MARKUS TRAVAILLE: Just a question from curiosity. You said that you were able to manage the queries that will come from the new gTLDs and all the DNSSEC and whatever. Do you have an idea how many more queries there will be? Do you have any estimates, any scenarios for that, and whether the infrastructure is ready to accommodate this? >>BILL MANNING: No. We have no idea what the change in the query rate will be. However, most public information about the query rates says that any instance of a root server is roughly handling between 8,000 and 15,000 queries a second. And, with modern hardware and modern software, most of the server instances can handle upwards of several hundred thousand queries per second. And most of the time that provisioning has been done to handle denial-of-service attacks. As the actual query rates grow, we don't expect them to grow instantaneously. We expect them to actually show a growth curve. And, given the hysteresis in the system, we believe we will have some time to react, to add additional capacity as needed. But right now we don't see a sea change in the way things are happening. We believe there will be small, gradual changes that can be foreseen and projected and planned for, as opposed to a -- what is the right word I'm looking for here? -- an unexpected change. We do not expect that the provisioning side is going to make a decision that we are unaware of and throw something over the wall that's unexpected. We have pretty good relationships with the provisioning side on that. So we've got another one over here. >>SUZANNE WOOLF: Before you move on, just to add a little bit, one word to what Bill had to say. The other thing to keep in mind about how the system responds to changing the size of the root zone is that the impact on the number of queries we're seeing is not a direct relationship. It depends a lot on how those new TLDs are used, by whom, and with what patterns of activity. So it's not a simple thing to model. And there's a great deal we're going to have to learn from experience. >>LYMAN CHAPIN: It's also worth pointing out that the goal of what we're trying to do in this study is not so much to anticipate a particular kind of change in the query load, but to show what the effect of an increase in query load will be. 
So we'll collect the data and do the analysis necessary to show that, if the query load doubles, triples, goes up an order of magnitude, et cetera, here's what the effects will be, and here's how the system as a whole can be expected to react to that change. That's a different kind of question and a different kind of study objective than looking at the system and trying to anticipate what the consequence will be of a particular kind of increase. So, you know, where will things break, for example? If there's an increase in the query load, a response could be to just increase the bandwidth to anycast server instances and increase the size of the servers and so forth. We're going to show how those effects are felt, without trying to make the policy decision of -- well, that's too much, or that's unacceptable, or things will break if... There's any number of things that can be done. And the way in which decisions get made about how to mitigate those effects is separate from the effects themselves. Jordyn, go ahead. >>JORDYN BUCHANAN: I'm Jordyn Buchanan, for the record. Am I not on? >>LYMAN CHAPIN: Now you're Jordyn Buchanan. You weren't Jordyn Buchanan for the record. >>JORDYN BUCHANAN: I am, in fact, Jordyn Buchanan. I have a couple of questions about the anycast nodes. The first is: the publication of zone data from wherever receives the data from the distribution master out to the anycast nodes -- is that all done using DNS protocols as well, or is some of that out of band? >>BILL MANNING: To my understanding, having talked with most of the root server operators that do anycast, it's done with DNS protocols as the primary, preferred method. There are out-of-band alternative techniques, not using DNS protocols, which are there and are occasionally exercised as contingency plans. We have not yet exercised the carrier pigeon delivery mechanism. >>JORDYN BUCHANAN: Do we do only full zone updates? Is that why we're concerned about the size of the zone, or is there incremental update as well? >>LYMAN CHAPIN: The current methodology is to do an entire zone transfer, because for a 300-entry zone it doesn't make sense to do incremental updates. >>SUZANNE WOOLF: Because why not? It's simple. It works. No reason not to do it that way. >>BILL MANNING: It will fit on a floppy disk, if you remember what those are. >>JORDYN BUCHANAN: Fair enough. So my other question is regarding the importance of the anycast nodes. Are they there for latency, capacity, both? >>BILL MANNING: Yes. Latency and capacity. As things scale up, those tradeoffs become different. >>JORDYN BUCHANAN: Fair enough. Has any -- assuming we kept the status quo in place, have you guys looked, or has anyone looked, at what the effect on capacity and latency would be if we start to lose nodes because there are low-bandwidth links there, absent scaling up those links? >>BILL MANNING: My understanding is a couple of anycast studies have been proposed to examine those questions and have yet to receive adequate funding. >>JORDYN BUCHANAN: Thanks. >>BILL MANNING: Got one more over here. >>JAMES SENG: Hi, I'm James Seng, for the record. I have a question. Based on historical data, does the query rate increase with the number of TLDs or with the number of users online? >>BILL MANNING: Well, the queries don't come from the TLDs. The queries come from the end users, and the applications that the end users use, and the applications and the content that are housed inside the particular TLDs. So the -- perhaps -- if I may be so bold. >>JAMES SENG: Let me clarify. 
I'm trying to draw a correlation for the query rates. What is it correlated to? Is it the number of TLDs, the total number of DNS registrations, or the total number of users online? >>BILL MANNING: If you're going to draw that correlation, I suspect it would be the total number of applications online, not users. But I wouldn't draw that correlation. >>LYMAN CHAPIN: A kind of simplistic way to put it is that the number of queries is related to the number of queriers and not to the number of things being queried. It's a little bit awkward to put it that way. But, you know, you can sort of think of it that way. But, again, one of the reasons we're collecting data from root server operators and others is precisely to be able to put together a picture of how this is likely to work. I say that only because I want to caution people that a lot of the answers that we're going to be giving to questions are based on preliminary investigation, and we haven't completed all the information gathering yet. >>JAMES SENG: The reason I ask is that I'm just trying to get at what leads to the increase of queries on the root servers, which, obviously, is going to be a problem -- at least a scaling problem. Is it the number of registrations? Is it the number of users online? And you point out it's the number of queries that potentially increases. So what causes the increase in queries? Based on historical -- >>SUZANNE WOOLF: It's really hard to give a simple answer, because you're asking is it one of these things or another or another. The fact is, it's all of those. It's how many TLDs, how many second levels under those TLDs, how many queriers, how many users. It's also the interaction of those things, in the sense that, if you have a small number of very popular new TLDs, for instance -- one of the characteristics of the system is that it caches, so not every interesting query goes all the way up to the root. If there's a great deal of activity around one TLD further down the tree, you're not going to wind up with more queries to the root, even though you have more users looking for the same top-level domain. So it's the interaction of all of those things, and that's one of the things we cannot predict. >>JAMES SENG: So just a comment. Maybe it would be interesting to see a study of the historical record of how the traffic of the root servers grew over the years. And, if there's a significant jump, what caused the jump? And it may be one of the multiple reasons that you cited. >>BILL MANNING: So, to provide you with one of these anecdotal answers: there was a jump in the query rate for the domain localhost after certain vendors put things in their applications for enterprise use that escaped out into the public Internet. From my perspective, it's the applications that drive the queries, and what the expectations are there. So I encourage you to register dot local -- or localhost -- as a TLD. [Laughter] >>SUZANNE WOOLF: Yeah, good luck with that. Yeah. The fact is that the vast majority of the queries seen at the root are for no such domain, or are queries where the ultimate answer is no, in one form or other. Because positive answers tend to be cached and referred to further down the tree. >>JOHN CRAIN: And there's a fairly good study of traffic at the root. >>BILL MANNING: Who are you? John, who are you? >>JOHN CRAIN: John Crain, ICANN, and sometimes operator of "L" root. So there's a fairly good study of traffic to the root servers that CAIDA did. They've been doing it for a couple of years. I'll see if I can find that URL and get that to you. It's pretty interesting reading. 
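[Editor's illustration: a toy model of the caching behavior Suzanne describes -- hypothetical code, not any resolver's actual implementation. Because a popular TLD's delegation stays in resolver caches for its full TTL, a million lookups of the same TLD can reach the root only once; what the root mostly sees is misses and nonexistent names.]

    import time

    cache = {}      # name -> (expires_at, answer)
    TTL = 172800    # two days, a typical TTL for root zone delegations

    def resolve(name, queries_to_root):
        entry = cache.get(name)
        if entry and entry[0] > time.time():
            return entry[1]               # served from cache; the root never sees it
        queries_to_root.append(name)      # cache miss: this one reaches the root
        answer = "referral for " + name   # stand-in for the real NS RRset
        cache[name] = (time.time() + TTL, answer)
        return answer

    root_queries = []
    for _ in range(1_000_000):            # a million user lookups of one TLD...
        resolve("org.", root_queries)
    print(len(root_queries))              # ...produce exactly one root query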
>>BILL MANNING: So that's the root server side. >>LYMAN CHAPIN: Ready to move across the line? >>BILL MANNING: It's no-man's-land; I don't want to go over there. [ Laughter ] >>LYMAN CHAPIN: Okay, on to the provisioning side. >>BILL MANNING: Wait, we've got at least one more question. >>LYMAN CHAPIN: Oh. >>ROY ARENDS: Thank you. Have you taken -- >>LYMAN CHAPIN: For the record -- >>SUZANNE WOOLF: Who are you? >>ROY ARENDS: Sorry, Roy Arends. Have you taken into consideration the dependency graphs? Basically, if you add, for instance, more TLDs to the root zone, you can create larger and deeper dependencies -- for example, de depending on fr, fr depending on dot net, if you know what I mean. These are basically (inaudible) records with, on the right-hand side, dependencies on other TLDs. And if you add more top-level domains and you add more different records, you can create very, very large and very interesting and even beautiful dependency graphs, which will inevitably lead to a higher number of queries. It will also include things like looping, mutual dependencies, mutually exclusive dependencies, circular dependencies, and whatnot. >>BILL MANNING: So, from the root-server perspective, we don't get involved in editorial decisions. We publish the data that we receive, on the presumption that the data is accurate and we get it intact from the provisioning side of the house. We're not going to check for these kinds of beautiful Byzantine dependencies -- which are really gorgeous in their sort of convoluted way -- but that's something that should be taken up as part of the provisioning questions. >>ROY ARENDS: I'm not going to ask the same question again on the provisioning side. My question really was: during this study, was this taken into account? >>BILL MANNING: Yes. >>ROY ARENDS: Okay, thank you. >>LYMAN CHAPIN: Okay, the provisioning side. This is the side of the system that gets the data to the point in the distribution master where it can be pulled by the root server operators and then published and made available to queries. And again, from a block diagram standpoint, it's a relatively simple system. The inputs to this system are the change requests that originate with TLD operators. Those can be things like adding a new entry to the root or changing an existing entry, but they are requests to change some piece of information that's contained within the root zone. Those requests are received today by IANA in e-mail messages, and IANA has to go through the process of validating the authenticity and authority of the source of the request as it comes in as an e-mail message. They're in the process of moving to a more automated system that will provide, again, a more automated way of checking the validity or authenticity of requests to make changes to the root. At any rate, the first step, obviously, is for IANA to go through an administrative check that makes sure that the change that's being requested is a valid change being made by a valid authority that is able to make that kind of change request. 
The requested change is then authorized by NTIA, the Department of Commerce, and the authorized request is then sent to VeriSign, which processes it and actually updates the root zone file, makes it available on the four distribution masters, and sends out the NOTIFY message that lets the root server operators know that a change to the root zone file has been made and that they can pull it from one of the masters. What we're trying to do in this study is to understand what each of these actors -- and each actor is represented by a different color -- what each of the actors does, in such a way that we can quantify the system, quantify what each actor does, and then be able to show what the effect of changes will be. So, if you get many, many more change requests per unit time, how will that show up as changes to the rate at which the IANA process, the DOC process, and the VeriSign process can be carried out? And again, the point is not to try to ask what-if questions at this point, but to build a model that would enable you to ask what-if questions. So, if we understand what the processing time through a particular function is, then we can express that as an equation, or however we want to do it, and it will be relatively straightforward to put together, step by step, all the way through the provisioning side, what effect a larger root, or more frequent changes to the root, or any of the other things that can be expressed as variables, will have on the processing through the system, and what the resource requirements will be, and so forth. And the idea, then, will be that anybody trying to make decisions about whether or not to pursue a particular policy direction will have available a model that will show them what the consequences of each of those policy decisions are likely to be. So that took even less time to explain -- the provisioning side -- than the query side. >>BILL MANNING: That's true. >>LYMAN CHAPIN: Yeah. Any questions about the way that part of the system operates? Yeah? >>ROY ARENDS: I think I know the answer, but I just want to have this confirmed. Communication between IANA, VeriSign, and the Department of Commerce -- how is it done? Is it a phone call? Is it an e-mail? Is the e-mail authenticated? Is it by fax, maybe? >>LYMAN CHAPIN: I believe that most of the exchanges are e-mail exchanges. >>BARBARA ROSEMAN: Barbara Roseman, I'm the IANA GM. Right now it's done, as it has been historically, through e-mail verification. It's fairly insecure in one sense, but it's actually built on a trust relationship that we have with many of the TLD operators. And so the basic premise is that anyone can submit a change request, and we will go to the confirmed contacts to validate the request. So there is always a step where the contacts who are listed for the domain are the ones that have to validate any request that comes in. >>ROY ARENDS: Thank you. >>DOUG MAUGHAN: Doug Maughan, DHS. Just out of curiosity, Bill and Lyman: as part of the study, did you look at the impact of DNSSEC as it starts to roll out as well? >>LYMAN CHAPIN: Yes, that's very explicitly part of the terms of reference for the study. We'll be looking at the intersection of the effects of increasing the size of the root by adding new gTLDs with the various policies and processes involved in signing the root and signing the zones that are delegations within the root. So very much so, yes. 
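[Editor's illustration: a back-of-the-envelope sketch of how zone size might grow with new TLDs and DNSSEC together. The per-entry byte counts are assumptions chosen for illustration, not measurements of the real root zone.]

    # assumed bytes per TLD: NS RRset plus A/AAAA glue for a delegation,
    # and DS + NSEC + RRSIG records once the zone is signed
    DELEGATION_BYTES = 500
    DNSSEC_EXTRA_BYTES = 400

    def root_zone_size(n_tlds, signed=False):
        per_tld = DELEGATION_BYTES + (DNSSEC_EXTRA_BYTES if signed else 0)
        return n_tlds * per_tld

    for n in (300, 3_000, 30_000):
        print(n, "TLDs:", root_zone_size(n) // 1024, "KB unsigned,",
              root_zone_size(n, signed=True) // 1024, "KB signed")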
>>BILL MANNING: Doug, as I mentioned earlier, this is the baseline of what's done today. Since DNSSEC is a new activity, it's not been completely defined how it's going to be integrated into the existing process. We'll hear more about that later this week, and we're working with those parties to try to make sure that our model comprehends those proposed changes and those proposed flows. But, like I say, this is the baseline. >>LYMAN CHAPIN: If someone knows off the top of their head -- there is a fairly lengthy DNSSEC symposium that's scheduled for -- is it Wednesday? Tomorrow? >> Wednesday at 9:00 a.m. >>LYMAN CHAPIN: Wednesday at 9:00 a.m., right. And many of the issues that we're concerned with in this study are also issues that the folks looking at DNSSEC are concerned with. Other questions? >>SABINE DOLDERER: Sabine Dolderer. I have a question. You're saying you're currently working very much on the current policies and the current system, and what this will actually result in when the root gets bigger and bigger. Have you also had a look at the e-IANA system? We have been pushing for years for a more automatic IANA system, and I know that IANA is working on that -- to automate the requests from the TLDs for nameserver changes, and to make it smoother and more reliable and quicker. Do you also include that in your study? >>BILL MANNING: When -- >>BARBARA ROSEMAN: I was going to say that, yes, this has been added to the study in terms of the basic data that IANA is providing to the team as far as processing times and throughput. We're giving them data based on the current process and on how we anticipate the automated process is going to alter the dynamics somewhat, and we expect that that will change some of the knobs, you know, as Lyman has described. But the amount of change is not going to be hugely significant, given that IANA has improved its processing so much already. So, you know, I think we do expect to see some changes in throughput, but we expect to see much more of an impact from new gTLDs being added, or from the increases in adding v6 records, AAAA records, or other changes that are anticipated to increase the frequency of change, than from the amount of time it takes for the provisioning side to process a request. >>BILL MANNING: If I can add to that just a little bit. If you look at that interaction between the actors on the provisioning side, the actor with the largest upfront load -- to validate and verify that accurate data is being presented and put into the system -- is the IANA. And the e-IANA task is an attempt, as I understand it, to improve the accuracy of the data before it gets into the system, so they don't have to spend a lot of time checking and rechecking, because they will have done the validity checking before it enters the system. So, once it enters the system, it can be much more automated and smoothly pushed through. But right now the IANA still is sort of the gatekeeper for the integrity of the data that gets into the system. And as far as I'm concerned, they can take a month to make sure it's right. Or longer. Because once bad data gets into this, it's really hard to get out. So having good sanity checking upfront really helps out a lot, and I think that the e-IANA portion -- that data has been given to us and we're putting that into the model. I don't think it's going to change the way the actors interact with each other much. 
But it will automate a lot of that background work significantly. Did I get that right, Barbara? >>BARBARA ROSEMAN: Yes, I think that's (inaudible). >>BILL MANNING: Uh-oh. There's a whole row. Sabine, you're going to have to wait. >>JAY DALEY: Jay Daley. It's a comment. I've done root zone changes several times -- lots of times -- and, to my surprise, I've got it wrong a few times. I didn't think that was possible, but I have, and IANA have very nicely pointed out that I've got it wrong, and I've fixed it. So I just actually wanted to say that that level of manual checking works very well, if even somebody like me can get it wrong, okay? [ Laughter ] >>BILL MANNING: Pay her the 20 bucks now. >>SUZANNE WOOLF: That's actually a part of the concern: making sure IANA can continue to provide that level of service and assurance as, perhaps, the transaction volume gets higher. >>LYMAN CHAPIN: It's worth pointing out -- Sabine, I'm sorry, just before you go -- it's worth pointing out that one of the specific goals that was called out in the terms of reference was to maintain zero-error publication of data in the root zone. And up until now -- it's probably not completely accurate to say that we have operated a zero-error system, but it's close enough; let's just say that it's extraordinarily close to a zero-error system. One of the sort of obvious, just information-theoretic effects that you would expect to see from increasing the size of the root zone and increasing the frequency with which you make changes to it is an almost mathematical difficulty in maintaining a zero-error operation. So one of the things that we're considering in this study is what that requirement -- that TOR requirement -- is going to mean as we look at a significantly increased root zone file size. >>SABINE DOLDERER: But I still want to -- I understand that -- of course, I don't want to have wrong data in the root, and I think it's important that the checks are done very early, in a very secure and very sophisticated manner, even manually. But on the other hand, I think from an operational standpoint you sometimes also need a -- let's say a guaranteed time in which something happens when there is no error, and I think that is also a goal which we should try to achieve. And frankly, I have to say, I can't plan around an IANA action usually needing one month -- which it usually does -- but I think we shouldn't take that as a given. >>BILL MANNING: So what kind of numbers do you want to see? >>SABINE DOLDERER: Basically, let me -- there's an example I usually try to highlight. We still use within our setup some unique nameservers, which we maintain because we want to have reachability through our different service providers and not only within our anycast setup. But if something happens to that anycast server, I want to change them, and not keep them in the root if there is a problem with them. >>BILL MANNING: So -- >>SABINE DOLDERER: So my perspective would be: at least within one day. >>BILL MANNING: So a 24-hour -- >>SABINE DOLDERER: Yeah, and also on weekends. >>BILL MANNING: Service window should be fine? >>SABINE DOLDERER: Yeah, but also on weekends and on Christmas. >>BILL MANNING: 24 hours; pick whatever 24 you like. 
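[Editor's illustration: a minimal sketch of the "knobs" style of model the study team describes for the provisioning side. Every duration here is a placeholder assumption, not an IANA, NTIA, or VeriSign figure; the point is only that end-to-end turnaround is a function of the human-mediated steps plus the publication interval, and that automation is one knob among several.]

    # assumed hours per provisioning step (placeholders, not real data)
    STEPS_HOURS = {
        "iana_validation": 24.0,       # manual checks dominate today
        "ntia_authorization": 12.0,
        "verisign_edit": 4.0,
    }
    PUBLICATION_INTERVAL_HOURS = 12.0  # the zone is picked up twice a day

    def end_to_end_hours(speedup=1.0):
        """Hours from change request to publication, with a 'speedup'
        knob standing in for automation of the human-mediated steps."""
        processing = sum(STEPS_HOURS.values()) / speedup
        # on average, a finished change waits half a publication interval
        return processing + PUBLICATION_INTERVAL_HOURS / 2

    for s in (1, 2, 10):
        print("speedup x" + str(s) + ":", round(end_to_end_hours(s), 1),
              "hours end to end")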
>>BARBARA ROSEMAN: Yeah, I'm not going to get too much into different people's experience with IANA, because I think there are different issues that complicate each request. That's one of the things that we've experienced in IANA: each request turns out to be quite unique, regardless of how straightforward you intend it to be. I will say that right now there are three parties involved in putting a request through the process, and so, even with the most optimal type of processing, even with the most automation for the steps that can be automated, there are still several review steps. So it does take longer than one day to process a request end-to-end, unless there is some urgency to it. We have, in fact, done requests within a 24-hour period when it was made clear that a TLD was either going to be offline entirely or offline within their country, or, you know, those types of things. And so it's not that the system isn't capable of being ramped up to that on an as-needed basis; it's just that, in the normal course of things, that's really not the preferred method. It doesn't allow for the proper review of the request as it comes in. So that's been our attitude towards this up till now. >>LYMAN CHAPIN: In general, one of the things that we've found -- and this is something that applies broadly to the new gTLD program as a whole -- is that there's a big difference between parts of a process that involve human inspection or human participation and decision-making, versus things that can be automated. With human-mediated processes, in order to model them successfully, the way in which you characterize their response to changes in inputs tends to be very different from the way in which you can model automated processes, where you already have a much more deterministic baseline to work with. So, in the course of constructing the model and doing the study, we're being very sensitive to the potential -- in other words, we're trying to work with the system as it exists today, but we're trying, as much as possible, to anticipate changes that will fundamentally, in some qualitative way, change how we would model the system. So changing from a manual or human-mediated process for something that IANA does today to one that is more highly automated is something that, even if it's going to happen somewhat in the future, we're going to take into account, because it would be foolish not to: so many of the premises of the model that we would build would be undermined or invalidated by that kind of qualitative change. So we're trying to take those into account as much as possible. In some cases we're discovering new places where it might be appropriate to do that. >>BERTRAND DE LA CHAPELLE: Hi, my name is Bertrand de la Chapelle; I'm the French representative in the GAC. First of all, I was very interested in the detailed terms of reference for the study, which, for nontechnical people and policy guys like me, were a very good highlight of the different dimensions -- in particular, the compounding effect of the possibilities of not only increasing the number of TLDs, but also the number of queries, and the number of updates and modifications. So it's a very important element. And thank you just for the terms of reference. It's already interesting. 
The second thing is: I understand from what Barbara Roseman was saying that, although there are some elements that involve manual interaction and others that don't and can be automated, many requests remain unique, which suggests that there's little likelihood of economies of scale as the number of TLDs grows. Am I right in assuming that or not? Meaning that, as the number of TLDs grows, we do expect in any case that there will be a load that grows both on IANA and on the system of the root servers. And the last question is: I've been very surprised at the reemergence of this issue at this very late stage in the gTLD process. The reason why I've been very surprised is that the fundamental assumption of the whole gTLD process is an unlimited domain name space. You don't do this kind of allocation if you don't presume that the space is unlimited. Because, if you presume that there is scarcity, or limits that are due to scale, then the whole allocation method needs priorities, different categories, and so on. My concern is the following: knowing the period that you will need to do the study -- although it is a relatively short time -- what if, in the end, the result is that there is a scalability problem? Maybe not in the long term, because we can ramp up the capacity, but maybe in the relatively short term, if there were an outburst of applications coming immediately. What if the result is this, and it actually puts into question the whole assumption upon which the three-year process is based? How do you feel about this, and how can you reconcile this with the process? >>LYMAN CHAPIN: Yeah. It's a good question. It's one that has, I think, probably occurred to all of us. It certainly occurred to those of us who are conducting the study, who felt that, although it was definitely a good idea to conduct the study, there was some wondering why it hadn't been commissioned earlier and so forth. But more important than that, I think, is to look at the fact that what we're trying to do is not to uncover some terrible consequence that is going to invalidate the whole process, but to look, as clearly as possible, at what the consequences are going to be. So, when you decide that you're going to do something, you're not going to suddenly decide not to do it just because it's going to have a consequence that you have to deal with. You're going to say: okay, how do I deal with that consequence now? And I see the study that we're conducting now, and the reason that it was commissioned, as a desire to understand the consequences so that ICANN and the other actors, the other people involved, can take appropriate actions to deal with them. Now, it is possible, at least in theory, that we might come up with a potential consequence that is so extreme or so unanticipated or so surprising that it completely invalidates -- or calls into question -- the whole process. We certainly haven't found anything like that -- yet. But even if that were the case, you would still have to say: okay, maybe it's too bad that we waited so long. 
But just because we waited so long, you wouldn't want to then say: we've waited so long, and we're afraid of what we might find, so we shouldn't look. So I understand your concern, and I've had some of the same thoughts, but I'm very glad we've been asked to do the study. And I think the best way to look at it is that we're trying to put everything up front, so that we know what we have to deal with, we know what the risks and the effects are going to be, and we can, as a community, take the appropriate steps to be sure that whatever mitigations we need to apply, we apply them sooner rather than later. >>BERTRAND DE LA CHAPELLE: If I understand correctly, your intention is to map the quantitative consequences. >>LYMAN CHAPIN: Absolutely. >>BERTRAND DE LA CHAPELLE: And you don't expect any qualitative threshold that would make one zone size -- number of TLDs -- significantly different from a lower number? Like there is not, in your perspective, any threshold that seems to appear that would say: well, fundamentally, until -- I don't know -- 100, 10,000, 1 million, it functions basically in a scalable mode, but beyond that it becomes very difficult. You don't expect a discontinuity in the -- >>LYMAN CHAPIN: Well, the point at which a quantitative change -- going from one order of magnitude to another -- becomes a qualitative change is a policy decision. So, if we discover that there is a relationship, for instance, between the size of the root zone and the cost of maintaining a network of anycast servers, we can show, almost as an equation, that as the number of entries in the root zone goes up, the cost of maintaining an anycast network goes up. Somebody has to decide that they're going to pay that additional cost. And, as a matter of policy, someone may decide that, when that cost gets to a certain point, they have to change the way they do budget justification; they have to come up with a different business model. Okay, that may or may not represent the same thing as saying, as the size of the root grows, "this is too many." Okay? The consequence of growing the root zone is that the cost of maintaining the anycast network goes up. It's very difficult to say at what point that becomes, from a policy perspective, too expensive, unsustainably expensive. So we're deliberately trying to stay away from that. That's where you get into the realm of anecdote, where somebody says: if you grow by three orders of magnitude, it would be simply impossible to maintain any anycast servers in parts of the world that don't have good connectivity. It's very difficult to have a good, concrete discussion about what to do to accommodate changes to the root server system when you're dealing with different people's perceptions of what is or isn't too much. Okay, somebody who is willing to spend the money can afford to put bigger anycast servers out there, and, if it becomes important enough, they'll put extra bandwidth out there. Those are all things that might be done, and we could have a debate about whether people would be willing to do it, or how to do it, but those are policy questions, not technical questions. >>BILL MANNING: Can I try to answer your second question, since he answered your third one? >>LYMAN CHAPIN: I like to start at the back and work in. >>BILL MANNING: That's what I was doing -- start at the right and work left. The question you were talking about was dealing with the current difficulty that IANA has with the uniqueness of requests. 
In many respects, we are dealing with an Internet that is almost brand new. It's hard to think of it that way, but, in fact, that's really what's happening. If we adopt your premise that the number of potential new TLDs is unbounded -- to a large degree, the existing 300-plus were effectively handcrafted by artisans. And when you start to mass-market, you can't expect handcrafting by artisans to continue, and so there will be a regularization as the process improves. The outcome is not expected or anticipated to be "here is the new way." It is "here is the way forward," and we will continue to evolve the processes and procedures to accommodate the demand. And that, again, plays to what Lyman was talking about, which is that there are costs involved in doing these things, and people will figure out what to do to make those things happen -- to continue this -- if it's important. And it will end up being self-limiting at some point. But, looking forward, I expect that there will be more automation and more regularization of the data and of what's expected before it gets into the root zone processing system. And so a lot of those kinds of concerns, I think, will go away over time. >>LYMAN CHAPIN: One of the most important things that we're doing -- or, you know, we haven't done it yet, but we're working on it -- is, again by gathering information, to find out what kind of system the root server system is, so that we have good ways of thinking about how it is likely to respond to certain events. Some systems, for example, respond very smoothly and linearly to changes in their inputs. That's just the kind of system that they are. Other systems respond to changes in ways that can exhibit very sudden and dramatic shifts when an input variable reaches a certain threshold. So, in chemistry, if you have a well-buffered solution, you can keep adding acid to it and eventually reach the threshold. And, bang, the next time you add a little titration of acid, all of a sudden the pH goes haywire. Okay? It would be useful to know what kind of system the root server system is -- how it responds to changes in the circumstances in which it's being used. If we understand that, we'll be in a much better position both to know what the consequences of any policy decisions we make are likely to be, and also to understand much better how to have the discussions about those policy decisions, so that everybody has a clear understanding: here's how the system works; here's what the effects are going to be; let's all understand that. And then we can have a meaningful discussion about how we're going to make policy decisions within that space. Because right now most of our discussions are taking place, again, on the basis of a lot of anecdotes. And someone did once say that the plural of anecdote is data. But in this case I think we'd like to distinguish a little more clearly between the two. >>JIM BASKIN: I think we're getting close to the end here. But my name is Jim Baskin. I'm with Verizon Communications. When I came in here, I thought root scaling was dentistry 401. >>SUZANNE WOOLF: Some days it hurts less. >>JIM BASKIN: But talking about root server scaling: it's been mentioned there are things like IPv6 and DNSSEC, and there are going to be separate discussions on those. 
But those are things that are going to go into the mix at the same time as all these other changes, the increase in TLDs, whatnot. Is all of that being considered in your study, or are they being considered separately? It's all part of the grand -- >>BILL MANNING: Did we miss anything? >>JIM BASKIN: I'm not an expert on that. >>LYMAN CHAPIN: To the very specific part of your question: we are considering all of those things as they relate to each other, in the context of their effect on the root server system. There are lots of DNSSEC issues and v6 issues and IDN issues that are very important but are not part of our study. And I want to emphasize that, because this is one study of a particular set of issues concerning how the root server system reacts. It's not going to answer all the questions that are legitimately on people's minds about DNSSEC or v6 or IDNs. >>JIM BASKIN: Thank you. I understood that. I just wanted to make sure that, in your specific root server scaling study, you were taking all of those items into account to the extent that they impact the root servers. >>LYMAN CHAPIN: Very much so. Thanks. Any other questions? Yeah. >>JAMES SENG: Just one observation -- something I thought you were going to go into, but obviously not. There is actually a lot of operational activity that goes on within each root server operator, especially on anycast. How would that scaling affect the current operation of each individual operator? >>BILL MANNING: So, with one exception, every root server operator actually has multiple nodes deployed in the provisioning of root service. Sometimes it's behind a load balancer; sometimes it's geographically distributed. So, as an anycast solution, everybody is doing something with multiple nodes. Everyone is aware of potential changes in the increase in size -- absolute size -- as well as the potential for an increasing rate of change. And, quite frankly, most of us are worried about the rates of change. We believe there's sufficient hysteresis in the system to make sure that the rates of change from the provisioning side aren't going to eat our lunch, and so we can propagate out to the anycast nodes that we have. Things that might be problematic are, if there is a decision to do a step function -- instead of a couple of changes per day, we go to a million changes a day -- this might be problematic for distributing to the anycast nodes and would cause us to rethink how we distribute: whether that's using a full zone transfer or an incremental transfer; whether we request a change from twice-a-day distributions to every 30 seconds. You know, those are the kinds of things that concern most of the root server operators who are doing anycast: the rate of change, and coherency. So we're concerned, but we expect that it's not going to be a surprise, and we will have sufficient lead time in which to accommodate the change. >>SUZANNE WOOLF: We are now running over, I think. >>LYMAN CHAPIN: Okay, we'll take -- >>BERTRAND DE LA CHAPELLE: Just a last question. Sorry -- Bertrand de la Chapelle. I like very much the formulation: what kind of system is the root server system -- linear reaction or not. For, once again, people who are outside of the technical environment: one of the arguments that is often used to say you can have an unlimited number of TLDs is that, basically, it's not that different from the management of a very big registry like dot com, which has 80 million domain names. 
Can you very briefly explain, or point to something that can explain to a relative layperson, the major difference between the two systems -- in particular, in terms of the rates of updates and the distribution and replication of the file that a dot com manager does? How applicable is the analogy, or not? Because it's an analogy that we hear a lot in the policy environment. >>LYMAN CHAPIN: Yeah, that's a good point. The short answer is that it isn't possible to do that very quickly. It is certainly possible to do it. And, you know, many people have -- well, Steve, did you want to answer the question? Because I'm not going to let you ask another question. I'll let you answer this one, though. >>STEVE CROCKER: So, I had thought that I was not going to say a word in here, but I'll jump in. Thank you for that question. It comes up frequently; it's a standard thing. There are unspoken things -- things people just don't talk about and for which data is not so readily available. There is an enormous amount of care given to changes in any of the aspects of the root zone. When a change request comes in, it is looked at very closely. And the intent -- which is not achieved 100%, but pretty close -- is zero errors in that process. If you take the question that you've asked and you say, well, com is 80 million and what's the problem, and then you ask the same question -- what is the error rate in the changes that are made, how often is something not quite right about a change? -- well, the data is not so easily available. But I would have to believe that that error rate would not be a tolerable error rate. So there's a qualitative difference in that operation. >>BILL MANNING: I think Sabine wishes to rebut. Contribute. Contribute. >>SABINE DOLDERER: Just for the sake of the argument: I think we are currently doing roughly 3,000 to 5,000 changes a day. And I presume that we don't make a lot of errors in that. Actually, from my perspective, I know of nearly no error where a change request came in to DENIC and the data was then distributed otherwise in our nameservers -- and a lot of our registrars and officers would strongly disagree on that if it happened. So I think, of course, you have to be sure that the system you're relying on is working. And I would expect similar for dot com. Of course, if you automate a system, you have to prove that you do it reliably and you do it safely and you do it accurately. But I think it's possible. >>CALVIN BROWNE: Sorry, Sabine. Calvin Browne here. The answer isn't -- sorry, the question isn't how many errors DENIC has made. The question is how many errors have been in the DENIC system. So that includes your registrants having made incorrect requests and so forth. And, if you look at the thing as a whole, you want to look at that error rate, which is different from the number of errors that DENIC has made. >>SABINE DOLDERER: So you think that's my job? >>CALVIN BROWNE: Not at all. >>SABINE DOLDERER: (Speaker off microphone.) >>LYMAN CHAPIN: Pass the mic around. The mic is the token. >>DAVE PISCITELLO: I'm Dave Piscitello, ICANN. I think the difference, and the huge value add that IANA provides, is that in the typical registrar scenario, where someone goes and registers example.com, they can make any number of errors that IANA would typically check for and, you know, ensure don't get into the root. So, for example, if I go and I register example.com and I create a nameserver record through my registration account, and that's an incorrect nameserver record, it's going to go in. 
It's going to point to a bad IP address. The only person who really suffers from that is me, because my zone file is not at that IP address. That's exactly the kind of error that IANA has to spend a fair amount of time correcting. And certainly DE and any of the other registries are not going to be obliged to do that. But it's the cost of having that zero error in the root that we truly want to invest in and make certain that we don't corrupt. We have a tolerance for error -- which is not the obligation of the registries to correct -- in any part of the system below the root. >>JOHN CRAIN: Before I pass it to Sabine: that is, once again, a policy issue. So, you know, the answer is, if you want 80 million TLDs, this is the effect that's going to happen, and you're going to have errors -- and that's a policy question at the moment. The answer to that is we'd rather not, but that may change. >>LYMAN CHAPIN: One more. >>JORDYN BUCHANAN: May I make a point on that exact topic? It seems crazy to me that we're talking about error rates being close to zero. You guys are talking about magnitude and acting like it's the same thing as rate. But, in reality, if one error has happened in the past, right, your error rate is not zero. You're maybe 99.9% accurate, or something like that. And saying we're going to vastly increase the number of TLDs in the root zone and expect the magnitude to stay the same is actually demanding a vast improvement in your error rate, right? So I think it's perfectly reasonable for the magnitude to increase as the number of domains in the zone increases, as long as the rate remains consistent. So it seems perfectly reasonable to figure out what the actual historical rate is and try to keep it at that level, as opposed to trying to keep the magnitude identical. >>LYMAN CHAPIN: Although, intuitively, that sounds like a reasonable argument, when you apply it to the root zone, I don't think it actually applies. It may very well be the case that in the root zone a vanishingly small absolute number of errors is the requirement, not a small number of errors relative to the size of the zone. There may not, in fact, be a nice linear scaling relationship between the acceptable error rate and -- at any rate, go ahead. >>BARBARA ROSEMAN: We're getting past 5:30; I just want to remind people. This is Barbara Roseman again. In answer to a couple of issues raised here around error rates: I think there is some confusion about what error is and whose responsibility it becomes to ensure that there is zero error. Our goal in IANA is to ensure that nothing enters the root zone that hasn't been verified and validated. However, I will say that we have had a couple of occasions where data has gone in that either had an unintended effect or was simply incorrect by the time it was published. And there's another factor there, which has to do with the fact that nameservers don't stay up 100% of the time, so by the time something gets into the root zone it may have had its own issues with error. I agree with Lyman that the goal here is to keep the errors to a minimum regardless of the size of the zone. And, if that minimum can be made as close to zero as possible, then that's what we're going to aim for. We're hoping that the automation project is going to help with that quite a bit. But I don't think that we're intending to keep the magnitude of error in line with the number of TLDs. We're intending to keep the number of errors down to the absolute minimum. Period. 
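[Editor's illustration: the arithmetic behind this exchange. Holding the per-change error rate constant -- the illustrative 99.9% accuracy figure -- makes the absolute number of errors grow linearly with change volume, which is why the panel distinguishes rate from magnitude.]

    # illustrative only: 99.9% accuracy means 1 error per 1,000 changes
    per_change_error_rate = 0.001

    for changes_per_year in (100, 10_000, 1_000_000):
        expected_errors = changes_per_year * per_change_error_rate
        print(changes_per_year, "changes/year ->",
              expected_errors, "expected errors")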
>>LYMAN CHAPIN: And the point, of course, was that that's a difference between the operation of a zone like dot com as a commercial enterprise and the operation of the root. I think we should stop here. We've run over our time limit a little bit, which I think was fine, because we started a little late. The previous group overran their allotment as well, so we're just doing the same thing everyone else is. But I would encourage anyone who has remaining questions either to talk to me or Bill or Suzanne -- you can't talk to her because she's left -- or, more importantly, to keep in mind that there is a public comment site available. It will be up through the end of July. The reason for closing it at the end of July is that that's about the last opportunity we'll have to effectively take public comments into account in producing our report, which is due at the end of August. But scaling@ICANN.org is a good place to send any questions you didn't get a chance to ask today. With that, I'll close the session. Thank you all for attending, and I encourage you to stick around and talk to us afterwards if you'd like. Thanks. >>BILL MANNING: Thank you. [Applause]