0 A.D. menu bug



4 hours ago, nwtour said:

This 2.5 percent is offset by more efficient indexing in the file system.

If you switch from reading to accessing file meta-information, the file system will win back the time saved by defragmentation.

 

10 million stat() calls:
time perl -e 'my $i = 0; while (1) { stat("./binaries/system/pyrogenesis"); $i++; exit unless $i % 10_000_000; }'
10.47 sec

10 million Archive::Tar lookups:
time perl -MArchive::Tar -e 'my $i = 0; my $a = Archive::Tar->new("test.tar"); while (1) { $a->contains_file("file"); $i++; exit unless $i % 10_000_000; }'
118.23 sec

 

I suppose Linux does things differently too :)



22 hours ago, nwtour said:

This 2.5 percent is offset by more efficient indexing in the file system.

If you switch from reading to accessing file meta-information, the file system will win back the time saved by defragmentation.

There is also the file system cluster size, which affects the alignment of files (it's usually 4KiB on Windows).

I've tested a few file sizes on Windows with two mods containing the same list of files: one unpacked and one packed.

  • File size 480 bytes: unpacked costs +972% reading time compared to packed (total read ~4MiB).
  • File size 2333 bytes: unpacked costs +690% reading time compared to packed (total read ~80MiB).
  • File size 111333 bytes: unpacked costs +9.7% reading time compared to packed (total read ~1GiB).
  • File size 521111 bytes: unpacked costs -32% reading time compared to packed (total read ~1GiB).
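
A back-of-the-envelope check (my addition, assuming each file occupies at least one 4KiB cluster and ignoring per-file metadata) suggests why the 480-byte case suffers the most:

use strict;
use warnings;

# Numbers taken from the 480-byte test above: ~4MiB of payload in 4KiB clusters.
my ($file_size, $cluster, $payload) = (480, 4096, 4 * 1024**2);
my $files   = int($payload / $file_size);   # ~8738 files
my $touched = $files * $cluster;            # bytes the disk actually serves
printf "%d files, %.1f MiB touched, %.1fx the payload\n",
    $files, $touched / 1024**2, $touched / $payload;
# Prints: 8738 files, 34.1 MiB touched, 8.5x the payload

So before counting any directory lookups, the unpacked layout already pushes roughly 8.5 times more data through the disk, while the packed copies of the same files sit contiguously inside one archive.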
Link to comment
Share on other sites

1 hour ago, vladislavbelov said:
  • File size 480 bytes: unpacked costs +972% reading time compared to packed (total read ~4MiB).
  • File size 2333 bytes: unpacked costs +690% reading time compared to packed (total read ~80MiB).
  • File size 111333 bytes: unpacked costs +9.7% reading time compared to packed (total read ~1GiB).
  • File size 521111 bytes: unpacked costs -32% reading time compared to packed (total read ~1GiB).

On a hard disk with a 128-megabyte cache (WDC WD2005FBYZ), repeatedly reading 4 megabytes from one file is very cheap compared to reading random files from the file system.

If this was meant as "contiguous storage", then I agree.


31 minutes ago, nwtour said:

On a hard disk with a 128-megabyte cache (WDC WD2005FBYZ), repeatedly reading 4 megabytes from one file is very cheap compared to reading random files from the file system.

If this was meant as "contiguous storage", then I agree.

4MiB are read in both cases: a) for unpacked, the files are read in random order, 480 bytes each; b) for packed, the files are read in random order from a ZIP archive, 480 bytes each. All tests have multiple prewarm passes (they read the total size multiple times), so in theory the HDD caches should be warm.

Ideally I'd expect that reading multiple files whose total size is less than the cache size should take similar time for unpacked and packed mods.

But as I mentioned, a file system has its own meta-information and alignment, which can increase the "reading cost" of small files.
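
For reference, a minimal Perl sketch of such a prewarm-then-measure run (my reconstruction, not the actual test harness; a flat directory of unpacked files is assumed):

use strict;
use warnings;
use List::Util qw(shuffle);
use Time::HiRes qw(time);

# 'mod/*' is a placeholder path; the real test layout isn't shown in the thread.
my @files = shuffle glob 'mod/*';

sub read_all {
    for my $path (@files) {
        open my $fh, '<:raw', $path or die "$path: $!";
        local $/;           # slurp mode
        my $data = <$fh>;   # read the whole file
    }
}

read_all() for 1 .. 3;      # prewarm passes: warm the OS and HDD caches
my $t0 = time;
read_all();                 # timed pass over warm caches
printf "warm read of %d files: %.3f s\n", scalar @files, time - $t0;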


1 hour ago, vladislavbelov said:

4MiB are read in both cases: a) for unpacked, the files are read in random order, 480 bytes each; b) for packed, the files are read in random order from a ZIP archive, 480 bytes each. All tests have multiple prewarm passes (they read the total size multiple times), so in theory the HDD caches should be warm.

Ideally I'd expect that reading multiple files whose total size is less than the cache size should take similar time for unpacked and packed mods.

But as I mentioned, a file system has its own meta-information and alignment, which can increase the "reading cost" of small files.

There is a disk cache, and it works at very high speed. It loads data around the positions requested by seek()/read() system calls. This allows a single file to be loaded completely into the cache when requests go to one file, with the data then delivered at SATA-2 interface speed. https://www.alphr.com/what-is-hard-drive-cache/ (Block Reading Ahead and Behind)

You're arguing that both tests ran from the disk cache, yet the archive was 10 times faster. That cannot be.

My test showed that meta-information from the file system is always faster than meta-information from an archive. The file system is a hot, indexed database of file information in kernel space, while any archive always handles lookups inside it in a primitive way.


38 minutes ago, nwtour said:

You're arguing that both tests ran from the disk cache, yet the archive was 10 times faster. That cannot be.

It might be hard to believe, but it can :)

39 minutes ago, nwtour said:

The file system is a hot, indexed database of file information in kernel space, while any archive always handles lookups inside it in a primitive way.

That's true, but there's one small detail: even the most powerful general-purpose filesystem is still general, which means it has its own tradeoffs and can't fit all possible cases. The same goes for archives.

So the trick is that when we load a mod in pyrogenesis, we cache the list of its files and store an offset for each of them, which means we don't need to make an expensive system call for each file.
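
As a sketch of that idea only (pyrogenesis itself is C++; the paths, offsets, and the flat "mod.pack" layout below are made up for illustration), in Perl it would look roughly like this:

use strict;
use warnings;

# One-time index built while loading the mod: path => [offset, size].
# In reality it would come from a single pass over the archive's table
# of contents; here it is hard-coded for the sketch.
my %index = (
    'art/textures/ui/logo.png' => [   0,   480 ],
    'gui/page_pregame.xml'     => [ 480, 2_333 ],
);

open my $pack, '<:raw', 'mod.pack' or die "mod.pack: $!";

sub vfs_read {
    my ($path) = @_;
    my $entry = $index{$path} or return undef;  # O(1) probe, no stat()/open() per file
    my ($offset, $size) = @$entry;
    seek $pack, $offset, 0 or die "seek: $!";
    read $pack, my $buf, $size;
    return $buf;
}

my $xml = vfs_read('gui/page_pregame.xml');

After the one-time index build, opening any file costs a hash probe plus one seek and one read on an already-open handle.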

43 minutes ago, nwtour said:

My test showed that meta-information from the file system is always faster than meta-information from an archive.

I can only assume that in your case tar has a worse index.
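
For instance, the repeated per-call query can be replaced by one pass over the listing (a sketch; it claims nothing about Archive::Tar's internals, only that a prebuilt hash makes each repeated check an O(1) probe):

use strict;
use warnings;
use Archive::Tar;

my $tar = Archive::Tar->new('test.tar');

# Build the lookup table once...
my %seen = map { $_ => 1 } $tar->list_files;

# ...then rerun the 10-million-check loop as plain hash probes.
my $i = 0;
while (1) {
    my $found = exists $seen{'file'};
    $i++;
    last unless $i % 10_000_000;
}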


19 hours ago, vladislavbelov said:

So the trick is that when we load a mod in pyrogenesis, we cache the list of its files and store an offset for each of them, which means we don't need to make an expensive system call for each file.

Red card for unsportsmanlike conduct B)
That amounts to re-implementing a NoSQL database.

