Designing and cracking web verification codes: lessons from the new 12306 verification code


On March 16, 2015, 12306, the official railway ticket purchase website, introduced a new verification method on its login page. After entering a login name and password, users must pick the correct pictures in an image verification code before they can log in. It is reported that since the change, none of the ticket-grabbing tools have been able to log in.

What devastating news! No doubt every major Internet company is now busy developing a new ticket-grabbing assistant that can crack the new verification code.

Below, we walk through the design principles of the various kinds of verification code and how each one is cracked.

The first kind is the plain-text verification code, a relatively primitive type.

Strictly speaking, this type does not meet the definition of a verification code, because only automatically generated questions qualify. The text questions are drawn from a question bank of limited size, so cracking them is simple: refresh the page a few times, build a table of questions and their answers, use a regular expression to pull the question out of the web page, and look up the matching answer. Some sites instead use randomly generated arithmetic of the form: random number, random operator from [+-*/], random number, =?. Any programmer with elementary-school arithmetic can solve those.
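As a rough illustration of both attacks described above, here is a minimal Java sketch. It is not taken from any real tool: the class name `TextCaptchaSolver`, the sample question, and the exact formula layout matched by the regular expression are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy solver for plain-text verification codes (all names are hypothetical). */
final class TextCaptchaSolver {
    // Assumed layout: "<number> <operator> <number> = ?" somewhere in the page text.
    private static final Pattern FORMULA =
            Pattern.compile("(\\d+)\\s*([+\\-*/])\\s*(\\d+)\\s*=\\s*\\?");

    // Question bank built up by refreshing the page and recording the answers.
    private final Map<String, String> bank = new HashMap<>();

    void learn(String question, String answer) {
        bank.put(question.trim(), answer);
    }

    /** Returns the stored answer, or null if the question has not been seen yet. */
    String lookup(String question) {
        return bank.get(question.trim());
    }

    /** Evaluates the first formula found in the text; returns null if there is none. */
    static Integer solveFormula(String pageText) {
        Matcher m = FORMULA.matcher(pageText);
        if (!m.find()) {
            return null;
        }
        int a = Integer.parseInt(m.group(1));
        int b = Integer.parseInt(m.group(3));
        switch (m.group(2).charAt(0)) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            default:
                if (b == 0) {
                    return null; // division by zero has no answer
                }
                return a / b;    // '/' is treated as integer division here
        }
    }

    public static void main(String[] args) {
        TextCaptchaSolver solver = new TextCaptchaSolver();
        solver.learn("What color is the sky?", "blue"); // made-up bank entry
        System.out.println(solver.lookup("What color is the sky?"));
        System.out.println(solveFormula("<label>7 * 6 = ?</label>"));
    }
}
```

The question bank only works because the set of questions is finite, and the formula solver only works because the format is fixed; both weaknesses come from the design rather than from the implementation.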

This kind of verification code is not completely useless, though. Many spam bots submit to any form they come across, and their operators have no reason to spend much effort on a single website, so even a weak check stops them. Against someone determined to flood your website with spam, however, it is no better than having no verification code at all.

The second kind is the image verification code, currently the mainstream choice:

Image verification codes make recognition harder by running characters together. The simple variant discussed here is typical of small websites.

This type of verification code is processed as follows:

Image preprocessing

How do we remove the background interference? Notice that each digit or letter in the verification code is drawn in a single color, so divide the image into 5 vertical strips.

Compute the color distribution of each strip. Apart from white, the most frequent color is the color of the character, so everything else can be discarded as background.

Code:

    // Needs: java.awt.Color, java.awt.image.BufferedImage, java.io.File,
    // java.util.HashMap, java.util.Map, javax.imageio.ImageIO.
    // isWhite(int rgb) is a helper that returns 1 for near-white pixels.
    public static BufferedImage removeBackgroud(String picFile)
            throws Exception {
        BufferedImage img = ImageIO.read(new File(picFile));
        // Trim the 1-pixel border.
        img = img.getSubimage(1, 1, img.getWidth() - 2, img.getHeight() - 2);
        int width = img.getWidth();
        int height = img.getHeight();
        double subWidth = (double) width / 5.0;
        for (int i = 0; i < 5; i++) {
            // Histogram of the non-white colors in strip i.
            Map<Integer, Integer> map = new HashMap<Integer, Integer>();
            for (int x = (int) (1 + i * subWidth); x < (i + 1) * subWidth
                    && x < width - 1; ++x) {
                for (int y = 0; y < height; ++y) {
                    int rgb = img.getRGB(x, y);
                    if (isWhite(rgb) == 1)
                        continue;
                    if (map.containsKey(rgb)) {
                        map.put(rgb, map.get(rgb) + 1);
                    } else {
                        map.put(rgb, 1);
                    }
                }
            }
            // The most frequent non-white color is the character color.
            int max = 0;
            int colorMax = 0;
            for (Integer color : map.keySet()) {
                if (max < map.get(color)) {
                    max = map.get(color);
                    colorMax = color;
                }
            }
            // Binarize the strip: character color becomes black, the rest white.
            for (int x = (int) (1 + i * subWidth); x < (i + 1) * subWidth
                    && x < width - 1; ++x) {
                for (int y = 0; y < height; ++y) {
                    if (img.getRGB(x, y) != colorMax) {
                        img.setRGB(x, y, Color.WHITE.getRGB());
                    } else {
                        img.setRGB(x, y, Color.BLACK.getRGB());
                    }
                }
            }
        }
        return img;
    }

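This produces the following image.

The next step is to scan the image vertically, column by column, and cut it at the gaps between characters.

Then scan each piece horizontally, row by row, to trim the blank rows above and below the character.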
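The two scans can be sketched on a tiny binary grid. This is a simplified illustration, not the author's listing: the class `ProjectionSplitter`, the `#`/`.` text format, and the rule that a single ink pixel is enough to count a column as part of a character are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy projection-based segmentation on a binary image (true = ink). Names are hypothetical. */
final class ProjectionSplitter {

    /** Vertical scan: returns [start, end) ranges of consecutive columns that contain ink. */
    static List<int[]> splitColumns(boolean[][] ink) {
        int height = ink.length;
        int width = ink[0].length;
        List<int[]> runs = new ArrayList<>();
        int start = -1; // -1 means "currently between characters"
        for (int x = 0; x <= width; x++) { // x == width flushes a run touching the right edge
            boolean hasInk = false;
            if (x < width) {
                for (int y = 0; y < height; y++) {
                    hasInk |= ink[y][x];
                }
            }
            if (hasInk && start < 0) {
                start = x;                      // a character begins
            } else if (!hasInk && start >= 0) {
                runs.add(new int[] {start, x}); // a character ends
                start = -1;
            }
        }
        return runs;
    }

    /** Horizontal scan: returns the [top, bottom) rows that contain ink within columns [x0, x1). */
    static int[] trimRows(boolean[][] ink, int x0, int x1) {
        int top = -1;
        int bottom = -1;
        for (int y = 0; y < ink.length; y++) {
            for (int x = x0; x < x1; x++) {
                if (ink[y][x]) {
                    if (top < 0) {
                        top = y;
                    }
                    bottom = y + 1;
                    break;
                }
            }
        }
        return new int[] {top, bottom};
    }

    /** Parses rows of '#' (ink) and '.' (background) into a grid. */
    static boolean[][] parse(String... rows) {
        boolean[][] ink = new boolean[rows.length][rows[0].length()];
        for (int y = 0; y < rows.length; y++) {
            for (int x = 0; x < rows[y].length(); x++) {
                ink[y][x] = rows[y].charAt(x) == '#';
            }
        }
        return ink;
    }

    public static void main(String[] args) {
        boolean[][] grid = parse("........", ".#..##..", ".#..#.#.", "........");
        for (int[] run : splitColumns(grid)) {
            int[] rows = trimRows(grid, run[0], run[1]);
            System.out.println("columns " + run[0] + ".." + run[1]
                    + ", rows " + rows[0] + ".." + rows[1]);
        }
    }
}
```

Real images need a noise tolerance, such as ignoring columns with only a stray pixel or splitting runs that are too wide to be one character, which this sketch leaves out.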

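Then train: save the cut-out characters as templates, each labeled with the character it shows.

Finally, because the characters have a fixed size, recognition works the same way as in "verification code recognition--1": a pixel-by-pixel comparison against the templates is sufficient.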
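The pixel comparison amounts to counting differing pixels against every labeled template and keeping the closest one. A minimal sketch follows, assuming glyphs of identical size stored as rows of `#` and `.`; the class `PixelMatcher` and the two tiny templates are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy template matcher for fixed-size binary glyphs (all names are hypothetical). */
final class PixelMatcher {

    /** Counts the positions where two equally sized glyphs differ. */
    static int distance(String[] a, String[] b) {
        int diff = 0;
        for (int y = 0; y < a.length; y++) {
            for (int x = 0; x < a[y].length(); x++) {
                if (a[y].charAt(x) != b[y].charAt(x)) {
                    diff++;
                }
            }
        }
        return diff;
    }

    /** Returns the label of the template with the fewest differing pixels. */
    static String classify(String[] glyph, Map<String, String[]> templates) {
        String best = "";
        int min = Integer.MAX_VALUE;
        for (Map.Entry<String, String[]> entry : templates.entrySet()) {
            int d = distance(glyph, entry.getValue());
            if (d < min) {
                min = d;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, String[]> templates = new LinkedHashMap<>();
        templates.put("1", new String[] {".#.", ".#.", ".#."});
        templates.put("7", new String[] {"###", "..#", "..#"});
        String[] noisy = {".#.", ".#.", "##."}; // a "1" with one stray pixel
        System.out.println(classify(noisy, templates));
    }
}
```

This only holds up when the segmented characters line up with the templates, which is why the fixed character size matters.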

Source code:

    import java.awt.Color;
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import javax.imageio.ImageIO;

    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.HttpStatus;
    import org.apache.commons.httpclient.methods.GetMethod;
    import org.apache.commons.io.IOUtils;

    public class ImagePreProcess2 {

        // Labeled character templates, loaded once from the train2 folder.
        private static Map<BufferedImage, String> trainMap = null;
        // Running counter that keeps training file names unique.
        private static int index = 0;

        // Returns 1 if the pixel is dark (R + G + B <= 100), otherwise 0.
        public static int isBlack(int colorInt) {
            Color color = new Color(colorInt);
            if (color.getRed() + color.getGreen() + color.getBlue() <= 100) {
                return 1;
            }
            return 0;
        }

        // Returns 1 if the pixel is not dark (R + G + B > 100), otherwise 0.
        public static int isWhite(int colorInt) {
            Color color = new Color(colorInt);
            if (color.getRed() + color.getGreen() + color.getBlue() > 100) {
                return 1;
            }
            return 0;
        }

        // In this class the method only loads the image; no background is removed.
        public static BufferedImage removeBackgroud(String picFile)
                throws Exception {
            BufferedImage img = ImageIO.read(new File(picFile));
            return img;
        }

        // Horizontal scan: crop to the rows between the first and the last row
        // that contains a white pixel.
        public static BufferedImage removeBlank(BufferedImage img) throws Exception {
            int width = img.getWidth();
            int height = img.getHeight();
            int start = 0;
            int end = 0;
            Label1: for (int y = 0; y < height; ++y) {
                for (int x = 0; x < width; ++x) {
                    if (isWhite(img.getRGB(x, y)) == 1) {
                        start = y;
                        break Label1;
                    }
                }
            }
            Label2: for (int y = height - 1; y >= 0; --y) {
                for (int x = 0; x < width; ++x) {
                    if (isWhite(img.getRGB(x, y)) == 1) {
                        end = y;
                        break Label2;
                    }
                }
            }
            return img.getSubimage(0, start, width, end - start + 1);
        }

        // Vertical scan: count the white pixels in every column, then cut the
        // image wherever a run of "busy" columns ends.
        public static List<BufferedImage> splitImage(BufferedImage img)
                throws Exception {
            List<BufferedImage> subImgs = new ArrayList<BufferedImage>();
            int width = img.getWidth();
            int height = img.getHeight();
            List<Integer> weightlist = new ArrayList<Integer>();
            for (int x = 0; x < width; ++x) {
                int count = 0;
                for (int y = 0; y < height; ++y) {
                    if (isWhite(img.getRGB(x, y)) == 1) {
                        count++;
                    }
                }
                weightlist.add(count);
            }
            int i = 0;
            while (i < weightlist.size()) {
                int start = i;
                // Extend the run while the column holds more than one white pixel.
                // The bounds check fixes an IndexOutOfBoundsException that occurred
                // when a run reached the right edge of the image.
                while (i < weightlist.size() && weightlist.get(i) > 1) {
                    i++;
                }
                int length = i - start;
                i++; // step past the column that ended the run
                if (length > 12) {
                    // Too wide for one character: treat it as two stuck together
                    // and cut it in half.
                    subImgs.add(removeBlank(img.getSubimage(start, 0,
                            length / 2, height)));
                    subImgs.add(removeBlank(img.getSubimage(
                            start + length - length / 2, 0, length / 2, height)));
                } else if (length > 3) {
                    subImgs.add(removeBlank(img.getSubimage(start, 0,
                            length, height)));
                }
            }
            return subImgs;
        }

        // Loads the templates; the first character of each file name is the label.
        public static Map<BufferedImage, String> loadTrainData() throws Exception {
            if (trainMap == null) {
                Map<BufferedImage, String> map = new HashMap<BufferedImage, String>();
                File dir = new File("train2");
                File[] files = dir.listFiles();
                for (File file : files) {
                    map.put(ImageIO.read(file), file.getName().charAt(0) + "");
                }
                trainMap = map;
            }
            return trainMap;
        }

        // Returns the label of the template that differs from img in the fewest pixels.
        public static String getSingleCharOcr(BufferedImage img,
                Map<BufferedImage, String> map) {
            String result = "";
            int width = img.getWidth();
            int height = img.getHeight();
            int min = width * height;
            for (BufferedImage bi : map.keySet()) {
                int count = 0;
                int widthmin = width < bi.getWidth() ? width : bi.getWidth();
                int heightmin = height < bi.getHeight() ? height : bi.getHeight();
                Label1: for (int x = 0; x < widthmin; ++x) {
                    for (int y = 0; y < heightmin; ++y) {
                        if (isWhite(img.getRGB(x, y)) != isWhite(bi.getRGB(x, y))) {
                            count++;
                            // Already no better than the best match so far: give up early.
                            if (count >= min)
                                break Label1;
                        }
                    }
                }
                if (count < min) {
                    min = count;
                    result = map.get(bi);
                }
            }
            return result;
        }

        // Recognizes a whole verification code image, one character at a time.
        public static String getAllOcr(String file) throws Exception {
            BufferedImage img = removeBackgroud(file);
            List<BufferedImage> listImg = splitImage(img);
            Map<BufferedImage, String> map = loadTrainData();
            String result = "";
            for (BufferedImage bi : listImg) {
                result += getSingleCharOcr(bi, map);
            }
            ImageIO.write(img, "JPG", new File("result2//" + result + ".jpg"));
            return result;
        }

        // Downloads 30 sample images into the img2 folder.
        public static void downloadImage() {
            HttpClient httpClient = new HttpClient();
            for (int i = 0; i < 30; i++) {
                GetMethod getMethod = new GetMethod("http://www.pkland.net/img.php?key="
                        + (2000 + i));
                try {
                    // Execute getMethod
                    int statusCode = httpClient.executeMethod(getMethod);
                    if (statusCode != HttpStatus.SC_OK) {
                        System.err.println("Method failed: "
                                + getMethod.getStatusLine());
                    }
                    // Read content
                    String picName = "img2//" + i + ".jpg";
                    InputStream inputStream = getMethod.getResponseBodyAsStream();
                    try (OutputStream outStream = new FileOutputStream(picName)) {
                        IOUtils.copy(inputStream, outStream);
                    }
                    System.out.println(i + "OK!");
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    // Release the connection
                    getMethod.releaseConnection();
                }
            }
        }

        // Cuts every labeled image in temp into characters and stores them as
        // templates. The file name must start with the 4 characters shown.
        public static void trainData() throws Exception {
            File dir = new File("temp");
            File[] files = dir.listFiles();
            for (File file : files) {
                BufferedImage img = removeBackgroud("temp//" + file.getName());
                List<BufferedImage> listImg = splitImage(img);
                if (listImg.size() == 4) {
                    for (int j = 0; j < listImg.size(); ++j) {
                        ImageIO.write(listImg.get(j), "JPG", new File("train2//"
                                + file.getName().charAt(j) + "-" + (index++)
                                + ".jpg"));
                    }
                }
            }
        }

        /**
         * @param args
         * @throws Exception
         */
        public static void main(String[] args) throws Exception {
            // downloadImage();
            for (int i = 0; i < 30; ++i) {
                String text = getAllOcr("img2//" + i + ".jpg");
                System.out.println(i + ".jpg = " + text);
            }
        }
    }

The verification codes of giants such as BAT (Baidu, Alibaba, Tencent) combine several defenses: interference lines; a mixture of bold and regular characters; common Chinese characters (about 5,000 are in everyday use, with complex strokes and many look-alikes, which is far harder than 26 letters); a mixture of typefaces such as Kaiti, Songti, and Youyuan; pinyin; and distorted glyphs. Some require accurately recognizing 13 Chinese characters, which greatly increases the probability that recognition fails.

Of course, in addition to the mainstream image verification code, some websites offer voice verification codes for users with poor eyesight. Generally, this kind of verification code is a machine-generated voice reading out a number. Many programmers cut corners here, however: they prepare 10 recordings of the digits in advance and simply splice them together at random when generating a code, so every code is built from the same ten clips.
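A minimal Java sketch of why splicing fixed recordings is weak: once the ten clips are known, the digits can be read back by exact comparison, with no speech recognition at all. Everything here is an assumption made for illustration: the class `SplicedAudioDecoder`, the fake byte-array "recordings", and above all the premise that the clips are concatenated unchanged, with no noise, re-encoding, or gaps.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Toy decoder for audio spliced from ten fixed clips (all names are hypothetical). */
final class SplicedAudioDecoder {

    /**
     * Reads the digits back by matching a known clip, byte for byte, at each offset.
     * Assumes no clip is a prefix of another, so the first match is the right one.
     */
    static String decode(byte[] audio, byte[][] clips) {
        StringBuilder digits = new StringBuilder();
        int pos = 0;
        while (pos < audio.length) {
            int matched = -1;
            for (int d = 0; d < clips.length; d++) {
                int end = pos + clips[d].length;
                if (end <= audio.length
                        && Arrays.equals(Arrays.copyOfRange(audio, pos, end), clips[d])) {
                    matched = d;
                    break;
                }
            }
            if (matched < 0) {
                throw new IllegalArgumentException("unknown audio at offset " + pos);
            }
            digits.append(matched);
            pos += clips[matched].length;
        }
        return digits.toString();
    }

    /** Stand-in for the website: concatenates the prerecorded clips for the given digits. */
    static byte[] splice(String digits, byte[][] clips) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (char c : digits.toCharArray()) {
            byte[] clip = clips[c - '0'];
            out.write(clip, 0, clip.length);
        }
        return out.toByteArray();
    }

    /** Fake "recordings": clip d is d + 3 bytes long, filled with a byte value unique to d. */
    static byte[][] fakeClips() {
        byte[][] clips = new byte[10][];
        for (int d = 0; d < 10; d++) {
            clips[d] = new byte[d + 3];
            Arrays.fill(clips[d], (byte) (17 * d + 1));
        }
        return clips;
    }

    public static void main(String[] args) {
        byte[][] clips = fakeClips();
        System.out.println(decode(splice("2015", clips), clips));
    }
}
```

Real audio would need the clips to be located and compared with some tolerance, but the underlying weakness is the same: ten fixed building blocks leave very little to recognize.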

The design principles are as follows:

Overall effect

• The number of characters is random within a certain range

• The font size is random within a certain range

• Wave distortion (at a random angle within a certain range)

Anti-recognition

• Don't over-rely on anti-recognition techniques

• Don't use too large a character set (it makes for a poor user experience)

Anti-segmentation

• Overlapping, touching characters work better than interference lines

Backup plan

• A completely different set of verification codes of the same strength
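Two of the "overall effect" items can be sketched in a few lines of Java: drawing a parameter at random within a range, and applying a wave distortion. This is an assumed, simplified take rather than any site's actual generator. The class `WaveDistortion`, its parameters, and the choice to shift whole columns along a sine curve are all illustrative, and text rendering and character overlap are left out.

```java
import java.awt.image.BufferedImage;
import java.util.Random;

/** Toy building blocks for a verification code generator (all names are hypothetical). */
final class WaveDistortion {
    static final int WHITE = 0xFFFFFFFF;

    /** Uniform random integer in [min, max], e.g. for the character count or the font size. */
    static int randomInRange(Random rnd, int min, int max) {
        return min + rnd.nextInt(max - min + 1);
    }

    /** Shifts every column up or down along a sine wave; uncovered pixels become white. */
    static BufferedImage wave(BufferedImage src, double amplitude, double period, double phase) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        for (int x = 0; x < w; x++) {
            int shift = (int) Math.round(amplitude * Math.sin(2 * Math.PI * x / period + phase));
            for (int y = 0; y < h; y++) {
                int sy = y - shift; // source row that lands on row y
                dst.setRGB(x, y, (sy >= 0 && sy < h) ? src.getRGB(x, sy) : WHITE);
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        int charCount = randomInRange(rnd, 4, 6);   // assumed range
        int fontSize = randomInRange(rnd, 18, 26);  // assumed range
        double phase = rnd.nextDouble() * 2 * Math.PI;

        // A white canvas with one black line stands in for rendered text.
        BufferedImage canvas = new BufferedImage(60, 20, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < 60; x++) {
            for (int y = 0; y < 20; y++) {
                canvas.setRGB(x, y, y == 10 ? 0xFF000000 : WHITE);
            }
        }
        BufferedImage distorted = wave(canvas, 3.0, 30.0, phase);
        System.out.println(charCount + " characters at size " + fontSize
                + ", output " + distorted.getWidth() + "x" + distorted.getHeight());
    }
}
```

Randomizing the phase for every image means the same character never lands on the same pixels twice, which is what defeats a fixed-template pixel comparison.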

Once the principles are known, cracking these verification codes becomes much easier.

The problem is that this time the 12306 verification code is made of actual pictures, so none of the methods above apply. Does that mean it cannot be cracked?

Some people think the 12306 image library is not very large, so it could be scraped in full and then cracked. For now that is only talk. There is, however, a method that is both very advanced and very primitive, known as "network coding" or "human coding".

Technical experts forward each verification code to their own "coding" software, where human "coding workers" type in the answer. The answer is then passed back to the automatic registration machine to complete the verification.

For now, this simple, crude method is enough to cope with the situation.

Conclusion:

12306 has played a killer move this time, knocking out every ticket-grabbing tool at once. The scalpers may be unhappy, but the rest of us can still buy tickets. The change not only tackles the scalper problem but also sets programmers a difficult challenge.
