UPSB v3 Archive | [project][4.11] Optimal Tournament Structures and Ranking Systems

chrisphd

Date: Thu, Mar 12 2009 09:58:34

Optimal Tournament Structures and Ranking Systems

Field Manager: Chrisphd

Field Description: This generalized field of study involves investigations into various possible tournament formats that may be used in pen spinning competitions. The field of study will also encompass investigations into inter- and intra-forum ranking and seeding systems for pen spinning competitions. Of particular interest is the method of applying intra-forum ranking systems to seed determination in global (inter-forum) pen spinning competitions.

Field Aims: This field aims, in part, to determine an optimum tournament structure that is time-efficient, fair, and appealing to pen spinners and non-pen spinners alike. A fair tournament logically requires a reliable method for determining seeds.

Example Projects:
Some example projects in this field include:

1) An analysis of a certain type of tournament structure which may include the benefits and disadvantages of the structure as well as improvements that can be made to the structure.

2) Invention and description of a new tournament structure.

3) Ideas for seeding systems that determine seeds in international tournaments by taking into account a spinner's ranking in a particular forum and the strength of that forum as a whole.

Zombo

Date: Thu, Mar 12 2009 18:12:33

we need something flexible enough that each community can govern its own local battles any way it wants, while its rankings remain comparable on a global level.

Mats

Date: Tue, Mar 17 2009 18:49:51

Global level ranking system? Could we not use a system similar to that used in chess for that?

more info

Awesome

Date: Tue, Mar 17 2009 21:04:32

Well we already have a ladder rating system, but not enough ladder battles happen for it to mean anything

Zombo

Date: Tue, Mar 17 2009 21:25:06

QUOTE (Awesome @ Mar 17 2009, 05:04 PM)

Well we already have a ladder rating system, but not enough ladder battles happen for it to mean anything

the ladder rating only applies to UPSB

we want an international system such that people are ranked locally but also have a global rank.

this can then be used to seed players in tournaments to avoid random draws.

Awesome

Date: Tue, Mar 17 2009 22:48:01

Yeah, but if a ladder system isn't active, how could you expect people to participate in something global?

That said, it would be sweet if we could get an international ladder, maybe on its own site where everyone could go to issue challenges. I doubt we could get something like that to work though.

But Mats' idea of a chess rating should work, since it was designed for basically the same thing as this, considering most battles are 1v1.

Zombo

Date: Tue, Mar 17 2009 22:57:55

just cuz ppl are not active now doesn't mean we can't discuss a global ranking system.

Awesome

Date: Tue, Mar 17 2009 23:09:54

Lol I know, just forget my first post here and we should be all good

but the chess system should work. It was developed a long time ago and is still used today, so it obviously works there. We can use that system if we can get all the boards to agree to it. Then every spinner would have a rating according to his skill, and it's easy to see how a spinner compares to any other spinner based on their ratings, provided that they have battled enough opponents to allow for an accurate rating.

The only thing I could think of is that chess players play more chess games than spinners battle, so it might not be as accurate for our uses.

Mats

Date: Tue, Mar 17 2009 23:14:33

QUOTE (Awesome @ Mar 17 2009, 11:09 PM)

Lol I know, just forget my first post here and we should be all good

but the chess system should work. It was developed a long time ago and is still used today, so it obviously works there. We can use that system if we can get all the boards to agree to it. Then every spinner would have a rating according to his skill, and it's easy to see how a spinner compares to any other spinner based on their ratings, provided that they have battled enough opponents to allow for an accurate rating.

The only thing I could think of is that chess players play more chess games than spinners battle, so it might not be as accurate for our uses.

The fact that people would have to do a certain minimum number of battles in order to get a rating might encourage them to battle more?

Zombo

Date: Wed, Mar 18 2009 03:46:56

just wondering in Chess, how does the rating work anyway?

there are numerous chess clubs around the world and if you only play locally, how does your Elo rating mean anything internationally? from what I understand everybody starts with 1000? is there some kind of objective standard?

Mats

Date: Wed, Mar 18 2009 08:03:24

Well, when you play at a chess club, people there will generally play at tournaments and such, and so will play people from a large area (from around the whole country, although perhaps not quite the whole country in a place as large as Canada). There may also be players who occasionally participate in international tournaments. This means that even if you only play locally, you will be playing people who have had their ratings affected by others far removed from the club.

I think in pen spinning the same would apply. For instance, even if every forum had only its own ladder, so long as people battled spinners from other ladders often enough, the ratings should balance out internationally.

If this were not to happen and each forum only had their own ladder, then the ratings would only be accurate to that board and you could not compare players from different ladders based upon ratings. However, if a few battles were done between spinners from the two boards, then the ratings difference could be worked out. For example, if JEB and UPSB both had ladders, and during several battles the JEB spinner was more likely to win, although their rating was lower, by looking at the ratings of those who won/lost/drew and taking into account the differences between them and deviation from expected results, we would know that a rating of say 1500 in Japan would be roughly equal to a rating of 1700 on UPSB.
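The kind of calculation described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard Elo expected-score curve with the conventional 400-point scale, and the function names, the tuple layout of `battles`, and the search bounds are all made up for the example. It looks for the number of points to add to one board's ratings so that the predicted results of the cross-board battles match the actual ones.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def estimate_offset(battles, lo=-2000.0, hi=2000.0, iterations=60):
    """Estimate how many points to add to board A's ratings so that they are
    comparable with board B's ratings.

    battles: list of (rating_a, rating_b, score_a) for cross-board battles,
    where score_a is 1.0 for a win by the board A spinner, 0.5 for a draw
    and 0.0 for a loss. The search bounds lo/hi are arbitrary assumptions.
    """
    actual = sum(score for _, _, score in battles)
    # Predicted total score rises as the offset rises, so bisection works.
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        predicted = sum(expected_score(ra + mid, rb) for ra, rb, _ in battles)
        if predicted < actual:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

With only a handful of battles the estimate is very noisy, and if one board wins every cross-board battle there is no finite answer (the search just runs into its bound), which is another way of seeing why these battles need to happen often enough.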

I think in chess people have an assumed rating the same as the first opponent they play. If I play an opponent of strength 1450 then that will be my assumed rating. Then if I draw, 1450 will be my rating. If I win it will be pushed higher than this and if I lose it will go lower. I think about 8-10 matches are required before a rating is accurate. Until then, it is merely provisional.
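For reference, the usual Elo update can be sketched as below. This is a minimal sketch, not how any particular federation or ladder does it: the K-factor of 32 and the 400-point scale are the conventional chess values and are assumptions here, and the function names are made up for illustration. A draw counts as a score of 0.5.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=32.0):
    """Return the new (rating_a, rating_b) after one battle.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k=32 is an assumed K-factor, not something fixed in this thread.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + delta, rating_b - delta
```

Under these assumptions a draw between two 1450-rated spinners leaves both at 1450, a win moves the winner up and the loser down by the same amount, and beating a much higher rated opponent moves the ratings more than beating an equal one.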

chrisphd

Date: Wed, Mar 18 2009 09:30:07

Global Seeding System Possibilities - Draft
What is required is a structure similar to the one used in the Queensland education system, where students are first ranked within their school. Once students have been ranked in their school, all students in Queensland sit a test called the QCS test. The test is not used to determine the individual's rank, but to determine the rank of the school. For each school, an average QCS test result is determined, and the schools are then given weighting points depending on their average QCS result. A mathematical calculation then uses a student's rank within the school and the school's strength to determine the overall rank of the student in Queensland.

The above system would be the ideal system. However this is probably impractical in pen spinning, as the above system would require every single competitor in every forum to submit a video to be judged. This would be very time consuming, and I believe more efficient procedures can be used to approximate the ideal system mentioned above.

A more practical option utilising this seeding concept would be for each board to submit selected videos from its participants, which are then judged against a numeric grading system of some kind (further research may be required to determine a suitable grading system). Each board is then given a strength weighting, which is a function of the numeric gradings of the individuals on the board who submitted videos.

The next step in this plan is to decide which participants will be allowed to represent their board with a video to be judged. An optimal situation would require each board to first rank its own members. Once this is completed, the board should divide its members into 5 categories: A, B, C, D and E. (It is most beneficial if the category assigned to each member is confidential and known only to board executives, so as not to upset board members who do not like being classed as E, for example.) Once this is done, each board should select 1-2 members from each category and submit one of their videos for judging. A mathematical function will then be generated representing the overall strength of the board based on these results. If it is decided that a function will be chosen to represent each board, the function could be based on the average and standard deviation of the scores of a board's members, so that given a member's position on a certain board, the function can generate an expected value of that member's numeric score. This plan so far relies on the honesty of each individual board, in that they do not submit A grade spinners in the C category merely as a means to improve their board strength. This problem may be resolvable if it becomes compulsory for each board participating in this system to publicly show the rankings of its members.
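One possible shape for such a board function is sketched below. It is only a sketch under a strong assumption that is not part of the plan above: that judged scores on a board are roughly normally distributed, so the mean and standard deviation of the sampled videos are enough to describe the whole board. The function name and the mid-rank percentile formula are illustrative choices.

```python
from statistics import NormalDist, mean, stdev


def board_strength_function(sample_scores, n_members):
    """Build a function mapping a member's rank (1 = best) to an expected score.

    sample_scores: judged numeric scores of the videos the board submitted.
    n_members: total number of ranked members on the board.
    Assumes a normal distribution of scores (an assumption of this sketch).
    Needs at least two distinct scores, otherwise the spread is undefined.
    """
    dist = NormalDist(mean(sample_scores), stdev(sample_scores))

    def expected_score(rank):
        if not 1 <= rank <= n_members:
            raise ValueError("rank must be between 1 and n_members")
        # Mid-rank percentile: rank 1 sits near the top of the distribution.
        percentile = 1.0 - (rank - 0.5) / n_members
        return dist.inv_cdf(percentile)

    return expected_score
```

For example, `board_strength_function([9, 7, 5, 3, 1], 15)` gives a function whose value at rank 8 (the middle of 15 members) is the sample mean of 5, with higher values for better ranks. Seeds for a global tournament could then be ordered by these expected scores across boards.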

This concludes my first post on global seeding systems.
Chrisphd

Mats

Date: Wed, Mar 18 2009 10:16:31

I don't know if rating a whole grade of pen spinners from one or two combos done by spinners of that 'grade' is very wise. There is a huge variation in quality and difficulty of combos from individual pen spinners. It's quite likely that spinners from one grade will be equalled or surpassed in terms of quality of combo by spinners in a lower grade, simply because the ones in the lower grade put more time and effort into the combo. I think performance over time analysis is the only way to do it due to this huge difference in combo quality.

chrisphd

Date: Wed, Mar 18 2009 11:41:37

QUOTE

I don't know if rating a whole grade of pen spinners from one or two combos done by spinners of that 'grade' is very wise.

The essence of this plan isn't to rate the grades of the spinners and then compare the grades of one board with the grades of the other. Grade A on one board may be the equivalent of grade B on another board, but we are not trying to determine that with this system. The purpose of the system is to obtain a distribution of the pen spinners according to ability for a particular board. I believe that since we are examining only one board at a time and treating each board individually, a few members' performances from each grade are enough to estimate a distribution. However, you are right that more performances should be submitted from each grade to get a more accurate estimate of the distribution, and the number of members chosen will ultimately be determined by what is most practical. Other ideas to consider to improve this method might be to submit the best video of a particular member from a specified grade, etc. And I'm open to other possibilities in the process of selecting which videos will be evaluated.

chrisphd

Date: Wed, Mar 18 2009 12:58:37

QUOTE

It's quite likely that spinners from one grade will be equalled or surpassed in terms of quality of combo by spinners in a lower grade, simply because the ones in the lower grade put more time and effort into the combo.

Why wouldn't good spinners put a lot of time and effort into their combos?

Mats

Date: Wed, Mar 18 2009 13:05:46

QUOTE (chrisphd @ Mar 18 2009, 12:58 PM)

Why wouldn't good spinners put a lot of time and effort into their combos?

The spinners would put in a variable amount of effort. I'm just saying that with so many grades, some people in lower grades will undoubtedly put in more effort than some of the spinners in higher grades, which will distort comparisons. The variation in standard of a single combo or even 2-3 combos is too great for the grades to be judged from this. Can you imagine a chess player being rated from only 3 games, or a basketball team being considered best in the league after just 3 games?

chrisphd

Date: Wed, Mar 18 2009 13:39:39

QUOTE

The spinners would put in a variable amount of effort. I'm just saying that with so many grades, some people in lower grades will undoubtedly put in more effort than some of the spinners in higher grades, which will distort comparisons. The variation in standard of a single combo or even 2-3 combos is too great for the grades to be judged from this. Can you imagine a chess player being rated from only 3 games, or a basketball team being considered best in the league after just 3 games?

Alright. The "grades" are determined solely by the rank of the member on the board. For example, if there are 15 members on a board, the top 3 are grade A, the next 3 are grade B and so forth. A spinner's grade is therefore not determined by 2-3 combos, but by whatever ranking system the specific board chooses to use to rank its spinners. The grades only come into play when trying to analyse the overall strength of a board. Now say we select 2 members per grade per board; in total, the strength of each board would then be based on 10 spinners representing that board.

Fire Ant

Date: Wed, Mar 18 2009 13:49:56

I agree with Chrisphd on this topic. Using his method, it would be possible to rank people within boards and across boards at the same time, effectively killing two birds with one stone, as there would be no need for 2 different ranking systems, but rather a universal ranking system. I believe this method is one of the fairest universal ranking systems that could be used in this situation.

Like chrisphd said, the main point of this plan is to get the structure under way. More specific details such as number of combos to submit can be dealt with at a later date.

Awesome

Date: Wed, Mar 18 2009 14:07:16

But pen spinning changes so quickly. How often would you be willing to go through the whole process of getting every board to submit videos and judging them? Also, why do we need A B C D E classes? Why are we concerned about the best noob (E) or the best kind of not noob but still pretty noobie (D)? The bottom spinners don't determine the strength of a board at all IMO. Also, not everyone is going to consider the classes the same way, so one board might just pick a way stronger spinner for class E because they thought class E meant an average 3 month spinner, while another thought it was for an average 1 month spinner. I think the class system is too arbitrary to be used for actual ratings.

I like the idea of using a chess system, why make a new system when there is already one that has been designed already that has been proven to work. Also if you wanted to incorporate the A B C D E classes you could say certain ratings might put you into a certain class (a 2400 spinner is A class 2000 might be B etc). Of course you would probably want to look at how it balances out before deciding what rating is what.

chrisphd

Date: Wed, Mar 18 2009 14:12:27

QUOTE

Why are we concerned about the best noob (E) or the best kind of not noob but still pretty noobie (D)?

This is required to gain an idea of the distribution of pen spinning ability throughout a board. This is important because even though a pen spinning board might have some very strong spinners, the medium level spinners of the same board may be extremely weak, and it would be unfair to give these weaker spinners a higher seeding position than they deserve in a tournament just because they belong to a board that has some very good spinners. Instead, if a distribution of a board is obtained, which is possible through the grading system, these medium spinners will be weighted according to their position on the board and the strength distribution function associated with that board.

chrisphd

Date: Wed, Mar 18 2009 14:17:09

QUOTE

Also, not everyone is going to consider the classes the same way, so one board might just pick a way stronger spinner for class E because they thought class E meant an average 3 month spinner, while another thought it was for an average 1 month spinner.

You have misunderstood the grading concept here.
This system has 3 steps:
Step 1: A board ranks all pen spinners who wish to be a part of the system in order from 1 to n.
Step 2: The board divides the number of participants by 5. Let this number be y.
Step 3: The board then allocates the top y participants as grade "A", the next y participants as grade "B", etc.

If this system is followed, the grading allocation is no longer subjective. Boards will not be able to invent their own classifications as to what is A grade or B grade. These grades are defined as the top 20%, the next 20% (20-40%), etc. respectively. What each board does have control over, however, is how it determines the rank order of its individual members.
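The three steps can be written down as a short routine. This is a minimal sketch: the function name is made up, and rounding y up when the number of participants is not divisible by 5 is an assumption, since the steps above do not say what to do with the remainder.

```python
import math


def assign_grades(ranked_members, n_grades=5):
    """Map each member to a letter grade from their position in the ranking.

    ranked_members: list of members ordered from rank 1 (best) to rank n.
    n_grades: number of grades, 5 (A to E) as in the steps above.
    """
    labels = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[:n_grades]
    # y in Step 2, rounded up so that every member lands in some grade
    # (the rounding rule is an assumption of this sketch).
    y = math.ceil(len(ranked_members) / n_grades)
    return {member: labels[i // y] for i, member in enumerate(ranked_members)}
```

With 15 ranked members this puts ranks 1-3 in A, 4-6 in B, and so on down to ranks 13-15 in E. The only input is the board's own rank order, so the board has no say in where the grade boundaries fall.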

Outsmash

Date: Wed, Mar 18 2009 14:19:33

QUOTE (Awesome @ Mar 18 2009, 08:37 PM)

But pen spinning changes so quickly. How often would you be willing to go through the whole process of getting every board to submit videos and judging them? Also, why do we need A B C D E classes? Why are we concerned about the best noob (E) or the best kind of not noob but still pretty noobie (D)? The bottom spinners don't determine the strength of a board at all IMO. Also, not everyone is going to consider the classes the same way, so one board might just pick a way stronger spinner for class E because they thought class E meant an average 3 month spinner, while another thought it was for an average 1 month spinner. I think the class system is too arbitrary to be used for actual ratings.

I like the idea of using a chess system, why make a new system when there is already one that has been designed already that has been proven to work. Also if you wanted to incorporate the A B C D E classes you could say certain ratings might put you into a certain class (a 2400 spinner is A class 2000 might be B etc). Of course you would probably want to look at how it balances out before deciding what rating is what.

I don't think he meant E to be like the beginner level. It would be appropriate if E signified the people within the beginner-intermediate category (Beginner < Intermediate < Pro).

And stop Double-Posting please.

Awesome

Date: Wed, Mar 18 2009 14:29:43

Alright, but E would mean the people in the bottom 1/5 of the board. Also consider how many members there are on each board: UPSB could easily get 50 videos if enough people care. Judging all 50 and putting them in a big order of 1 -> 50 would be a huge task, and considering how much a spinner can improve in 6 months, this would have to be done regularly on every board to work.

I like the idea of every spinner getting a number based on all his performances instead of just one. Higher the number, better the spinner.

chrisphd

Date: Wed, Mar 18 2009 14:59:14

QUOTE

I like the idea of every spinner getting a number based on all his performances instead of just one. Higher the number, better the spinner.

As mentioned earlier, this idea will only work if there are enough inter-board competitions; however, it can potentially be very useful, as shown below.

We currently have discovered two problems:
Problem 1
- The chess rating system only works well if inter-board competitions are held regularly, and it requires all boards to adopt the same system.
Problem 2
- I agree it will be a hard task to rank every spinner on a board with 50 or so spinners, such as UPSB.

Solution:
The solution to these problems is that for UPSB, in order to rank the 50 or so members, we utilise the chess-like system. Once we have ranked our 50 members, we can determine the grades of members (i.e. A, B, C etc.), and then submit 2 or so from each group for judging. So it now becomes possible for a board like UPSB to rank all its members, and therefore the seeding system I proposed earlier can still apply for comparing UPSB with other boards that use a different system for ranking their members.

In other words, the chess-like system will be the local ranking system, and my proposed system will be used for determining global seedings.

Sadistic

Date: Wed, Mar 18 2009 14:59:58

I just wanted to insert a quick question: How will this rating system be handled in 1v1 battles exactly? If say, for example, S777 and Eriror battle, and Eriror wins at UPSB, but S777 wins at FPSB, who is given what rating?

chrisphd

Date: Wed, Mar 18 2009 15:03:13

QUOTE

I just wanted to insert a quick question: How will this rating system be handled in 1v1 battles exactly? If say, for example, S777 and Eriror battle, and Eriror wins at UPSB, but S777 wins at FPSB, who is given what rating?

If you are referring to the rating system I have proposed, then it would be inapplicable in this situation. My rating system is designed so that boards can rank their own members, and when there is a world tournament, the seedings of the competitors can then be determined by combining a participant's rank on his board with the strength of that board. The system isn't concerned with generating a global ranking of pen spinners, only with seeding players for global tournaments.

Mats

Date: Wed, Mar 18 2009 15:53:25

QUOTE (chrisphd @ Mar 18 2009, 02:59 PM)

As mentioned earlier, this idea will only work if there are enough inter-board competitions; however, it can potentially be very useful, as shown below.

We currently have discovered two problems:
Problem 1
- The chess rating system only works well if inter-board competitions are held regularly.

And your solution:

QUOTE

If this were not to happen and each forum only had their own ladder, then the ratings would only be accurate to that board and you could not compare players from different ladders based upon ratings. However, if a few battles were done between spinners from the two boards, then the ratings difference could be worked out. For example, if JEB and UPSB both had ladders, and during several battles the JEB spinner was more likely to win, although their rating was lower, by looking at the ratings of those who won/lost/drew and taking into account the differences between them and deviation from expected results, we would know that a rating of say 1500 in Japan would be roughly equal to a rating of 1700 on UPSB.

--------------------------------------------------------------------------------------------------------------------------------------------

QUOTE

requires all boards to adopt the same system.

Well surely all methods require this if one is to compare spinners from each board!

chrisphd

Date: Wed, Mar 18 2009 21:05:22

QUOTE

Well surely all methods require this if one is to compare spinners from each board!

No, my seeding system doesn't require the assumption that all boards use the same method to rank their members. My seeding system only assumes that each board does successfully manage to rank its members.

Awesome

Date: Wed, Mar 18 2009 21:34:09

QUOTE (Mats @ Mar 18 2009, 11:53 AM)

Well surely all methods require this if one is to compare spinners from each board!

Were you quoting me? That sounds like something I typed here >_<. And to that, good point XD

QUOTE (chrisphd @ Mar 18 2009, 05:05 PM)

No, my seeding system doesn't require the assumption that all boards use the same method to rank their members. My seeding system only assumes that each board does successfully manage to rank its members.

Whatever system you use, you will have to convince each board to use it. Yours seems like it takes a lot of effort to be accurate, so boards will be less inclined to use it, whereas the chess rating seems more of an implement-and-leave solution.

Zombo

Date: Thu, Mar 19 2009 00:09:42

i havent read everything yet, but let me lay out some ground assumptions for pen spinning:

1) Ranking spinners locally is a separate issue. I believe in giving communities independence in how they rank their own spinners. How we rank here at UPSB would be another issue. So just assume that every community can provide you a list of their spinners in ranked order. So the system should not specify local ranking.

2) We are only interested in the top spinners of each community, say the top 8. It is not necessary to extend the scope of the ranking a lot further than that.

3) The ranking doesn't have to be precise. There could still be some element of randomness during the drawing process. For instance, in a 6-round, 64-player tournament, the first 2 rounds will have very varied skill levels. After that, all battles will be tough regardless of the draw. An easy first round seeding procedure could seed 32 players, each of them battling against a random unseeded opponent from the bottom 32.
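The first round procedure in point 3 could look something like the sketch below. The function name and the optional `rng` argument are made up for illustration; the only input assumed is a list of the 64 entrants in rough ranked order.

```python
import random


def first_round_draw(ranked_players, n_seeded=32, rng=None):
    """Pair each seeded player with a random unseeded opponent.

    ranked_players: entrants ordered from strongest to weakest (approximate).
    n_seeded: how many of the top entrants are seeded (32 of 64 here).
    rng: optional random.Random instance, useful for a reproducible draw.
    """
    if rng is None:
        rng = random.Random()
    seeded = ranked_players[:n_seeded]
    unseeded = ranked_players[n_seeded:]  # slicing makes a copy
    if len(seeded) != len(unseeded):
        raise ValueError("need exactly as many unseeded players as seeds")
    rng.shuffle(unseeded)
    return list(zip(seeded, unseeded))
```

No two seeds can meet in round 1 under this draw, while the order inside the bottom half has no effect on the pairings, so only the split between top 32 and bottom 32 needs to be roughly right.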

-- at the heart there are 2 problems, but they could be solved using a single solution.

1. allocation of spinners per community: for WT09 we asked communities how many they wanted and tried to fit the best we could. wasn't ideal and some communities complained, naturally.
2. determine the seed ranks.

Now of course, if you have a ranking that allows you to tell who the top 64 spinners in the world are, then the #1 problem is de facto solved by admitting all those 64 spinners to the next WT, without even thinking about allocation per community. But maybe you don't, in which case you allocate spots per community, then decide from what they allocated, how you seed them.

schlynn

Date: Thu, Mar 19 2009 01:35:13

I agree with Awesome. The chess ranking system has been around since 1960 (before the log equations started being used). It has proven its worth on a global scale. And if people don't want to compete regularly, then their battles don't matter, because if you want to become good at anything you have to practice a lot, and higher level people should want to battle more. I'm not saying that someone should have to reach a certain number of battles to be counted, but that someone who wants to go up in the rankings will obviously battle more people. This is for global. I agree with Zombo that each board can come up with its own way of deciding who its top people are.

chrisphd

Date: Thu, Mar 19 2009 04:56:09

I agree with the chess rating system as a means for UPSB to rate its members; however, it will not work globally because it requires members from different boards to battle each other with the same frequency as members from the same board.

QUOTE

Whatever system you use, you will have to convince each board to use it. Yours seems like it takes a lot of effort to be accurate, so boards will be less inclined to use it, whereas the chess rating seems more of an implement-and-leave solution.

This is wrong. We do not need to convince each board to adopt a system to rank their members. We allow them to rank their members any way they please. Only once their members are ranked do we take the top 20% of members and say they are A grade, for example, and so forth as per my very first post. The only effort required of each board is therefore to make sure it has a ranking list of members. Each board can then adopt the chess system to determine its member rankings if it likes. All I'm saying is that the chess system still doesn't allow us to compare members from separate boards, as members usually only battle other members from the same board. A player with a high chess rating from a weak board might still be worse than a player with a low chess rating from a strong board, and thus the initial problem still occurs.

Fire Ant

Date: Thu, Mar 19 2009 07:26:40

QUOTE (Awesome @ Mar 19 2009, 12:07 AM)

...I like the idea of using a chess system, why make a new system when there is already one that has been designed already that has been proven to work...

Chrisphd's system has already been proven to work. As he said in his first post about the topic, it is used by Queensland Education in Queensland schools to rank students across schools. These schools all have different ways of ranking their students, as each school will write its own exam to test its own students. So this is a method that has been approved by a state government. It definitely works, I've been through it.

Zombo

Date: Thu, Mar 19 2009 11:40:57

some problems that I see:

- Star factor: some communities have a few really good spinners above everybody. The letter allocation shouldn't be equal, since there are very few elite spinners, a moderate number of good spinners, and a lot of average spinners and newcomers. You don't want to drag down someone's ranking just because his community is not as good.

- Scope: what would happen if we only compared grade A people from every community? Does it still work?

chrisphd

Date: Thu, Mar 19 2009 12:17:19

Problem 1: Star Factor
The reason for the letter allocation is so that board strength isn't measured by a single number; rather, the board will have a strength value allocated to each letter grade, or perhaps even a continuous distribution function. For example, a certain board may have a strength weighting of 10 for grade A members and a strength weighting of 5 for grade B members, etc. Furthermore, there will be a function that determines the strength of an individual based on the individual's ranking within his/her letter grade and the strength of that letter grade for that forum.

How does this resolve the problem?
This way if a board is weak, but has some star players, the star players will give the board a high strength value for its A grade, but the board will still have a weak value for its B and C grade players etc.
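As a rough illustration of what such an individual strength function might look like, here is a minimal sketch. The linear interpolation between a grade's weighting and the next grade's weighting is an assumption made purely for this example, as are the function name and the choice to let the weakest grade taper toward 0; the only weightings taken from above are 10 for grade A and 5 for grade B.

```python
def individual_strength(grade_weights, grade, position, grade_size):
    """Estimate one member's strength from their grade and rank within it.

    grade_weights: list of (label, weight) pairs, strongest grade first,
                   e.g. [("A", 10.0), ("B", 5.0), ...].
    grade: the member's letter grade.
    position: the member's 1-based rank inside that grade.
    grade_size: number of members in that grade.
    """
    labels = [label for label, _ in grade_weights]
    idx = labels.index(grade)
    weight = grade_weights[idx][1]
    # Slide from this grade's weight toward the next grade's weight.
    # The last grade slides toward 0 (an assumption of this sketch).
    if idx + 1 < len(grade_weights):
        next_weight = grade_weights[idx + 1][1]
    else:
        next_weight = 0.0
    fraction = (position - 1) / grade_size
    return weight + (next_weight - weight) * fraction
```

On a board with a strong A grade and weak lower grades, the top A spinner keeps the full A weighting, while B and C members are valued by their own grades' weightings and get no lift from the stars above them.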

Zombo

Date: Thu, Mar 19 2009 12:50:47

QUOTE (chrisphd @ Mar 19 2009, 08:17 AM)

Problem 1: Star Factor
The reason for the letter allocation is so that board strength isn't measured by a single number; rather, the board will have a strength value allocated to each letter grade, or perhaps even a continuous distribution function. For example, a certain board may have a strength weighting of 10 for grade A members and a strength weighting of 5 for grade B members, etc. Furthermore, there will be a function that determines the strength of an individual based on the individual's ranking within his/her letter grade and the strength of that letter grade for that forum.

How does this resolve the problem?
This way if a board is weak, but has some star players, the star players will give the board a high strength value for its A grade, but the board will still have a weak value for its B and C grade players etc.

ok but what if the A grade contains 20 ppl: out of the 20, 5 are insane, 15 just good.

if the A strength is strong: 15 ppl are overrated
if the A strength is weak: 5 ppl are underrated.

in UPSB this is what we have: a lot of members, 3-4 elite spinners, a dozen pretty good people, and the rest are just average or above average.

chrisphd

Date: Thu, Mar 19 2009 13:17:48

QUOTE

ok but what if the A grade contains 20 ppl: out of the 20, 5 are insane, 15 just good.

if the A strength is strong: 15 ppl are overrated
if the A strength is weak: 5 ppl are underrated.

in UPSB this is what we have: a lot of members, 3-4 elite spinners, a dozen pretty good people, and the rest are just average or above average.

I believe this issue could be resolved by allowing boards that have many participants to divide their members into more than just 5 classes (i.e., A, B, C, D and E). Allowing a board to divide its members into more classes means that the board will obtain a more accurate strength distribution function. Because classes are not directly compared between boards, it is ok if one board has more division classes than another board. It is reasonable, however, to ensure each board has a minimum number of classes (e.g., 5 classes), to prevent boards with a few strong players from misrepresenting the strength of their weaker players.

Awesome

Date: Thu, Mar 19 2009 17:14:45

Instead of just taking the top 20% for the A class, couldn't you come up with another method? A board could, for example, make its A class only the elite spinners, or maybe have a special elite class which only accepts the really pro spinners, with a special way to determine who gets in. I don't think the top class should work on a percentage of board members.

iMatt

Date: Thu, Mar 19 2009 18:05:52

This entire class thing just doesn't work. What you're trying to do is quite literally global stat tracking. This is being WAY over-thought.

My simple proposal. Keep our ladder system.

Just make a new page dedicated to stat tracking. Just make a separate ladder for global. Then get a dedicated stat tracker researcher/moderator *shrugs*.

Contact other moderators from other boards who submit their stats. The global stats would be displayed as an extension of UPSB.

The other part is you simply can't rate each board with a grade or overall rank. The number of spinners on each board varies dramatically, which will always skew any realistic stats.

Just a thought.

Sadistic

Date: Thu, Mar 19 2009 18:38:38

QUOTE (iMatt @ Mar 19 2009, 01:05 PM)

The other part is you simply can't rate each board with a grade or overall rank. The number of spinners on each board varies dramatically, which will always skew any realistic stats.

Well I think this is an issue we will need to cover if we wish to appropriately seed boards/members in tournaments, as different boards do not necessarily take part in battles as often as others would.

Awesome

Date: Thu, Mar 19 2009 18:54:58

QUOTE (iMatt @ Mar 19 2009, 02:05 PM)

This entire class thing just doesn't work. What you're trying to do is quite literally global stat tracking. This is being WAY over-thought.

My simple proposal. Keep our ladder system.

Just make a new page dedicated to stat tracking. Just make a separate ladder for global. Then get a dedicated stat tracker researcher/moderator *shrugs*.

Contact other moderators from other boards who submit their stats. The global stats would be displayed as an extension of UPSB.

The other part is you simply can't rate each board with a grade or overall rank. The number of spinners on each board varies dramatically, which will always skew any realistic stats.

Just a thought.

That was my initial suggestion, but I kind of wrote it off. That's probably the simplest solution and could work out nicely.

QUOTE

Well I think this is an issue we will need to cover if we wish to appropriately seed boards/members in tournaments, as different boards do not necessarily take part in battles as often as others would.

They only need to battle enough to give themselves an accurate rating, and we could just seed members from the global ladder system

Zombo

Date: Thu, Mar 19 2009 19:52:37

ok another assumption is that right now,

we only rank ppl for tournament purpose, so we dont need a persistent ranking,
only a ranking that we do before the tournament.

later on, i think it'd be cool to have like a global ladder system, and to enter the ladder, you must be in the top X of your local community, and if you suck too much in the global ladder, someone can take your spot.

schlynn

Date: Fri, Mar 20 2009 02:29:59

QUOTE (Zombo @ Mar 19 2009, 03:52 PM)

ok another assumption is that right now,

we only rank ppl for tournament purpose, so we dont need a persistent ranking,
only a ranking that we do before the tournament.

later on, i think it'd be cool to have like a global ladder system, and to enter the ladder, you must be in the top X of your local community, and if you suck too much in the global ladder, someone can take your spot.

Sounds good to me.

hoiboy

Date: Sat, Mar 21 2009 04:45:14

What about adding in the entire scoring system? Like have a preliminary round and take the top 2^x scorers and seed them accordingly. That way, boards can keep their own separate ladders and tournaments are free for all. Team-based tournaments are different though... >.>
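
A minimal sketch of the "top 2^x scorers" idea. The function name `seed_from_prelims`, the choice of the largest power of two that fits the field, and breaking ties by entry order are assumptions for illustration, not something specified in the post.

```python
def seed_from_prelims(scores):
    """Return [(seed, name), ...] for the top 2^x preliminary scorers.

    scores: dict of spinner name -> preliminary round score (higher is better).
    Assumption: x is chosen as large as the field allows, so the bracket
    size is the largest power of two not exceeding the number of entrants.
    """
    if not scores:
        return []
    bracket_size = 1 << (len(scores).bit_length() - 1)
    # sorted() is stable, so tied scores keep their original entry order.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(seed, name)
            for seed, (name, _score) in enumerate(ranked[:bracket_size], start=1)]


prelims = {"a": 7.5, "b": 9.0, "c": 6.0, "d": 8.0, "e": 5.0}
print(seed_from_prelims(prelims))  # five entrants, so a bracket of four
```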

Zombo

Date: Sat, Mar 21 2009 14:05:26

this preliminary round would be very difficult on the judges

not only would they need to judge a lot of videos, but their scores would need to be perfect, such that you can compare any 2 ppl and it makes sense.

Fire Ant

Date: Mon, Mar 23 2009 14:06:31

I think people are writing off chrisphd's method because they don't understand it fully or because it sounds too complex. I think chrisphd should provide a full worked example. It would clear a lot of things up, and everyone would be able to see how it really is an optimal solution.

schlynn

Date: Mon, Apr 13 2009 02:35:57

For a way to rank people that would be super easy to implement, I came up with a simple equation. It rewards people that battle more and people that win more. Here is a pic of the equation, so you can tell what I mean if my explanation isn't clear.

The equation is the square root of the log of a weighted harmonic mean.

How the equation works:

You take the total number of battles. That value is n. Then a win's value is 1, and a loss's value is 0. So let's say someone's record is 3 wins and 2 losses.

The denominator of the weighted harmonic mean is then 3, because 1+1+1+0+0=3. Divide their total number of games by 3, so 5/3=1.6... Then n+1 in this case is 6 (5 total games + 1). 6/1.6...=3.6. Take the log (base 10) of 3.6, which is ~.5563. Then you take the square root of that, and that value gives us ~.74586.

If you calculate Eriror's rank you get ~1.097. He had 15 wins last time I checked. This equation works out very well, because if someone has more wins than losses they can have the same rank as someone that has played more total battles but hasn't won as much. And if you guys don't want all the decimals, just use something like 1000*(the equation). That way we get numbers that we can use. This way seems really fair. It's almost like magic how this works.

EDIT:
Here is a calculator that I made to find the rank with this equation.
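
The arithmetic in the worked example can be reproduced with a short script. This is a sketch reconstructed from the numbers in the post, not the calculator mentioned above: it assumes the formula is sqrt(log10((n + 1) / (n / wins))), that the log is base 10 (which matches the ~.5563 figure), and that the ~1.097 figure corresponds to a 15 win, 0 loss record. The function name `battle_rank` is made up for illustration.

```python
import math


def battle_rank(wins, losses):
    """Square root of the log of (n + 1) divided by the harmonic-style mean.

    n is the total number of battles; each win counts 1 and each loss 0,
    so the sum of the values is simply the number of wins.
    """
    n = wins + losses
    if wins == 0:
        # n / 0 is undefined, so a record with no wins has no rank here.
        raise ValueError("the formula is undefined for a record with no wins")
    mean = n / wins              # e.g. 5 / 3 = 1.666...
    weighted = (n + 1) / mean    # e.g. 6 / 1.666... = 3.6
    return math.sqrt(math.log10(weighted))


print(round(battle_rank(3, 2), 5))   # the 3 wins, 2 losses example
print(round(battle_rank(15, 0), 3))  # 15 wins, assuming no losses
```

For any record with at least one win, (n + 1) * wins / n is greater than 1, so the log is positive and the square root is defined.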

Zombo

Date: Mon, Apr 13 2009 03:36:01

how did you derive this formula, is this used anywhere else?

also the purpose of this is to basically replace the current ladder formula?

schlynn

Date: Mon, Apr 13 2009 11:13:47

QUOTE (Zombo @ Apr 12 2009, 11:36 PM)

how did you derive this formula?

I was learning about harmonic means, and the fact that you can weight them so easily surprised me. I did some testing and then I came up with the n+1 to make the numbers more manageable. The log came from the fact that it is used in other ranking systems. And then the square root was just to get the number smaller. But we can play with the square root and the idea I suggested earlier, 1000*(equation), to get numbers that are not decimals.

QUOTE (Zombo @ Apr 12 2009, 11:36 PM)

is this used anywhere else?

As far as I know, this is an original equation.

QUOTE (Zombo @ Apr 12 2009, 11:36 PM)

also the purpose of this is to basically replace the current ladder formula?

If this works then yes. Because right now all the ladder system does is compare the win/loss ratio, right?

Zombo

Date: Mon, Apr 13 2009 13:24:17

the ladder looks at the current rating of both spinners and adjusts

Points gained by the winner = 100 / (X^3 + 1),
where X is the ratio:
Rating of the winner / Rating of the loser (prior to the battle)

the problem with your equation is that it doesn't evaluate the quality of the wins and losses. you can have 3 wins 2 losses against weak opponents only, or 3 wins 2 losses against strong opponents only.
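
A small sketch of the ladder formula quoted above, to show how the gain shrinks when the winner was already rated higher. Only the winner's gain is given in the post, so nothing is assumed here about what happens to the loser's rating; the function name `points_gained` and the sample ratings are illustrative.

```python
def points_gained(winner_rating, loser_rating):
    """Points gained by the winner: 100 / (X^3 + 1), X = winner / loser.

    Both ratings are the values prior to the battle and must be positive.
    """
    if winner_rating <= 0 or loser_rating <= 0:
        raise ValueError("ratings must be positive")
    x = winner_rating / loser_rating
    return 100 / (x ** 3 + 1)


print(points_gained(1000, 1000))  # evenly matched: X = 1
print(points_gained(1200, 1000))  # favourite wins: smaller gain
print(points_gained(1000, 1200))  # upset: larger gain
```

With equal ratings X is 1 and the winner gains exactly 50 points; an upset always pays more than 50 and an expected win less.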

schlynn

Date: Mon, Apr 13 2009 15:05:56

QUOTE (Zombo @ Apr 13 2009, 09:24 AM)

the ladder looks at the current rating of both spinners and adjusts

Points gained by the winner = 100 / (X^3 + 1),
where X is the ratio:
Rating of the winner / Rating of the loser (prior to the battle)

the problem with your equation is that it doesn't evaluate the quality of the wins and losses. you can have 3 wins 2 losses against weak opponents only, or 3 wins 2 losses against strong opponents only.

All right, I'll redo the equation and use that value then. Shouldn't be too hard. It will probably have something to do with the harmonic mean.

hoiboy

Date: Tue, Apr 14 2009 05:48:22

Has anyone thought of ranking people in the WT? I know for the past 2 ones, the boards have not developed enough to do something as sophisticated as that. But now, we've gotten to the point where an International Ladder or whatever could be set up, and seed the top 64 accordingly. That way, the arguably 1st and 2nd best spinners don't face each other in R1, which could happen if you drew randomly from a hat.

Also, why not modify the double/triple/whatever-big-number elimination system? Like in R2, there is a winner/loser bracket, and the winners from R1 have to lose twice while the losers from R1 have to lose 3 times total (or however many is necessary). I would draw up a chart, but it's 10:30 at night right now... >.> It would give more accurate standings for the WT overall.

8 person tourney example:

Round 1:
Match 1a: 1 vs. 8
Match 1b: 3 vs. 6
Match 1c: 4 vs. 5
Match 1d: 2 vs. 7

Round 2: (W)
Match 2a: (W) Match 1a vs. (W) Match 1b
Match 2b: (W) Match 1c vs. (W) Match 1d

Round 3: (W, W)
Match 3a: (W) Match 2a vs. (W) Match 2b
(Winner is 1st, Loser is 2nd)

Round 3: (W,L)
Match 3b: (L) Match 2a vs. (L) Match 2b
(Winner is 3rd, Loser is 4th)

Round 2: (L)
Match 2c: (L) Match 1a vs. (L) Match 1d
Match 2d: (L) Match 1b vs. (L) Match 1c

Round 3: (L,W)
Match 3c: (W) Match 2c vs. (W) Match 2d
(Winner is 5th, Loser is 6th)

Round 3: (L, L)
Match 3d: (L) Match 2c vs. (L) Match 2d
(Winner is 7th, Loser is 8th)

*In R1, the 1-8 represents the relative ranking of the 8 people at the start of the tournament. After that, ranking doesn't really matter.

In the long run, it'll all work out, except for the judges, having to grade so many videos.
This can be expanded as much as needed, and I'm working on a 64 person version.
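
The 8 person chart above can be walked through in code. This is only a sketch of that chart as written: the function name `run_eight_player_bracket` and the `winner_of` callback (standing in for the judges' decision) are hypothetical.

```python
def run_eight_player_bracket(seeds, winner_of):
    """Play out the 8 person chart and return the final placings, 1st to 8th.

    seeds:     list of 8 players, ordered from seed 1 to seed 8.
    winner_of: callable taking two players and returning the winner
               (a stand-in for the judges).
    """
    if len(seeds) != 8:
        raise ValueError("this sketch only covers the 8 person example")

    def play(a, b):
        winner = winner_of(a, b)
        loser = b if winner == a else a
        return winner, loser

    s = seeds
    # Round 1: 1 vs 8, 3 vs 6, 4 vs 5, 2 vs 7
    w1a, l1a = play(s[0], s[7])
    w1b, l1b = play(s[2], s[5])
    w1c, l1c = play(s[3], s[4])
    w1d, l1d = play(s[1], s[6])
    # Round 2 (W): winners of round 1
    w2a, l2a = play(w1a, w1b)
    w2b, l2b = play(w1c, w1d)
    # Round 2 (L): losers of round 1, paired 1a vs 1d and 1b vs 1c
    w2c, l2c = play(l1a, l1d)
    w2d, l2d = play(l1b, l1c)
    # Round 3: four placement matches
    first, second = play(w2a, w2b)     # (W, W)
    third, fourth = play(l2a, l2b)     # (W, L)
    fifth, sixth = play(w2c, w2d)      # (L, W)
    seventh, eighth = play(l2c, l2d)   # (L, L)
    return [first, second, third, fourth, fifth, sixth, seventh, eighth]


# Example: the better (lower-numbered) seed always wins.
print(run_eight_player_bracket(list(range(1, 9)), min))
```

If the better seed wins every match, the placings come out as 1, 2, 3, 4, 5, 7, 6, 8: seed 6 loses to 3 and then to 5, while seed 7 loses to 2 but beats 8, so 7 finishes ahead of 6 without the two ever meeting. Every player spins three times, for 12 matches in total.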

Zombo

Date: Tue, Apr 14 2009 12:10:28

uhh...

that's the whole point of this topic?

to be able to rank people for the next WT...

hoiboy

Date: Tue, Apr 14 2009 16:56:42

I know. That was a suggestion on how to run the tournament... although the judges would need to grade 64 videos every round, and the whole thing would run up to quadruple elimination/6 rounds.

Will this be set up like the UPSB ladder (with schlynn's formula)? I remember something about the International Pen Spinning League that died...

schlynn

Date: Wed, Apr 22 2009 03:39:38

Ok, sorry for my absence. To throw in the ratio, just add it as the weighted value in the formula.

So it will still be the square root of the log of the ratio divided by the harmonic mean n/(x_1+x_2+...). So instead of having n+1 as the weighted part of the formula, just have it be the ratio.

Awesome

Date: Wed, Apr 22 2009 20:53:08

What's wrong with the current ladder rating formula? It's a lot simpler; yours looks too complex while filling the same function.

Also, with yours it wouldn't be a linear system (since log and square root graphs curve): the difference between 800 and 1000 would be less than between 1000 and 1200, so that would make ratings make less sense than a linear system where you have a formula to figure out how much you gain from a match and just add that to your current rating.

And what do you mean by having n+1 just be the ratio? Doesn't your formula determine the rank of the spinner without requiring a previous rating (which is a pretty cool feature imo)? Then you can't use the ratio between two spinners.

If I completely misunderstood your formula then it's your own fault for putting in logs

hoiboy

Date: Wed, Apr 22 2009 23:23:38

Um, I think his formula takes into account how often you battle, who you battle, and all that stuff. The current one I believe is just win/loss score.

schlynn

Date: Wed, Apr 22 2009 23:34:09

QUOTE (Awesome @ Apr 22 2009, 04:53 PM)

What's wrong with the current ladder rating formula? It's a lot simpler; yours looks too complex while filling the same function.

Also, with yours it wouldn't be a linear system (since log and square root graphs curve): the difference between 800 and 1000 would be less than between 1000 and 1200, so that would make ratings make less sense than a linear system where you have a formula to figure out how much you gain from a match and just add that to your current rating.

And what do you mean by having n+1 just be the ratio? Doesn't your formula determine the rank of the spinner without requiring a previous rating (which is a pretty cool feature imo)? Then you can't use the ratio between two spinners.

If I completely misunderstood your formula then it's your own fault for putting in logs

Zombo wanted my equation to take into account the case where, say, someone new beat Eriror. But the best way to do that would probably be, instead of having the ratio, to just have your rank minus the other person's total wins as the weighted number. That way if you beat someone better than you it will weight it more and if you beat someone lower than you it won't raise it as much. I think that this will make it very nice. Maybe a few more tweaks and it will be complete.

I was just running some numbers and this doesn't work out too well. If someone who has the understanding of numbers that I do wants to PM me with some ideas, we can talk. The goal is to take both players' win/loss ratios into account in the equation, so that if you beat someone lower you don't go up as much and if you beat someone higher you go up more. I'm almost positive that it will be the weighted numbers, since it weights the value, duh

Awesome

Date: Wed, Apr 22 2009 23:37:26

QUOTE (schlynn @ Apr 22 2009, 07:34 PM)

Zombo wanted my equation to take into account the case where, say, someone new beat Eriror. But the best way to do that would probably be, instead of having the ratio, to just have your rank minus the other person's total wins as the weighted number. That way if you beat someone better than you it will weight it more and if you beat someone lower than you it won't raise it as much. I think that this will make it very nice. Maybe a few more tweaks and it will be complete.

I was just running some numbers and this doesn't work out too well. If someone who has the understanding of numbers that I do wants to PM me with some ideas, we can talk. The goal is to take both players' win/loss ratios into account in the equation, so that if you beat someone lower you don't go up as much and if you beat someone higher you go up more. I'm almost positive that it will be the weighted numbers, since it weights the value, duh

And what's wrong with the current one of 100 / (X^3 + 1)?

schlynn

Date: Thu, Apr 23 2009 02:20:15

QUOTE (Awesome @ Apr 22 2009, 07:37 PM)

And what's wrong with the current one of 100 / (X^3 + 1)?

It doesn't show the person's skill as accurately as it could.

Awesome

Date: Thu, Apr 23 2009 02:48:12

QUOTE (schlynn @ Apr 22 2009, 10:20 PM)

It doesn't show the person's skill as accurately as it could.

It shows it accurately relative to other spinners. You need to look at how other spinners compare; it's accurate enough that a given spinner can be ranked in a place like 1st, 2nd, 3rd... until you have as many places as spinners. Then you can see that a spinner is in 10th place, or has a rating close to other people's.

Skill is relative to the environment; you cannot evaluate the skill of a single spinner in isolation. Since both formulas will put the spinners in similar positions, they would both be accurate.

Since both would presumably rank spinners from best to worst, wouldn't it be better to take the simpler one?

(shouldn't this stuff be in its own topic, could a mod possibly move it)

schlynn

Date: Thu, Apr 23 2009 04:59:29

QUOTE (Awesome @ Apr 22 2009, 10:48 PM)

It shows it accurately relative to other spinners. You need to look at how other spinners compare; it's accurate enough that a given spinner can be ranked in a place like 1st, 2nd, 3rd... until you have as many places as spinners. Then you can see that a spinner is in 10th place, or has a rating close to other people's.

Skill is relative to the environment; you cannot evaluate the skill of a single spinner in isolation. Since both formulas will put the spinners in similar positions, they would both be accurate.

Since both would presumably rank spinners from best to worst, wouldn't it be better to take the simpler one?

(shouldn't this stuff be in its own topic, could a mod possibly move it)

The current equation is very simple, yes, but it doesn't have the inherent properties that my equation does, such as spreading out the skill levels of the spinners on curves and such.

hoiboy

Date: Sat, Jun 6 2009 21:21:36

Just a thought, but we could run this like soccer/football format.

We've got the AC, so why not add in an Americas Cup and an EC? Then you could take the top teams from each and throw them into the WC.
This is for team tourneys.

The problem is that the competitors in each cup would be of different strength, and there would be different numbers of spinners.

Zombo

Date: Sun, Jun 7 2009 02:01:10

it doesn't exactly fit, because the Asia Cup had Asians from UPSB, but UPSB is also America, and it's also Europe

hoiboy

Date: Fri, Mar 26 2010 00:08:30

what's the order of priorities here?

are we leaning more toward shorter tournaments or more accurate tourney results?

Zombo

Date: Sun, Mar 28 2010 00:04:48

hummm both?

in general the big tournament WC/WT should last no more than 5 months

hoiboy

Date: Sun, Mar 28 2010 00:55:34

how about pooling rounds 1, 2, and 3?
8 spinners per pool, 8 pools
top spinner from each pool advances to the final 8
still 6 rounds

any emphasis on increasing the number of spinners in the tournament?

Zombo

Date: Sun, Mar 28 2010 01:06:31

yea look here:

http://www.upsb.info/forum/index.php?showtopic=18783

this is pretty much the format of the next WT
