File management system, file management method, collection program, and non-transitory computer-readable information recording medium

Bibliographic Details
Title: File management system, file management method, collection program, and non-transitory computer-readable information recording medium
Patent Number: 11,169,962
Publication Date: November 09, 2021
Appl. No: 16/085435
Application Filed: March 17, 2016
Abstract: In a server (111), an updater (201) updates a file by an editing process that includes an adding process that adds a record to the end of the file. A collector (202) reads, in order of location in the file, a record included in the file, causes a collection device of a collection system to associate and collect the record and a position of the record in the file, and non-transitorily stores the position as an offset. An estimator (203) estimates whether header records located between the beginning of the file and the recorded offset are updated. When it is estimated that any of the header records are updated, a starter (204) causes the collector (202) to start reading the record from the beginning of the file. When it is estimated that none of the header records have been updated, the starter (204) causes the collector (202) to start reading the record from the recorded offset.
Inventors: Rakuten Group, Inc. (Tokyo, JP)
Assignees: Rakuten Group, Inc. (Tokyo, JP)
Claim: 1. A file management system, comprising: a server device that non-transitorily stores a file and an offset for the file; and a collection system, wherein (a) the server device executes an editing program, thereby functioning as an updater that updates the file by an editing process that includes an adding process that adds a record to an end of the file, (b) the server device executes a collection program, thereby functioning as a collector that, in a first execution of the collection program, reads, in order of location in the file, a record included in the file, causes the collection system to associate and collect the read record and a position where a beginning of the read record is located in the file, and updates the non-transitorily stored offset to a position where an end of the collected record is located in the file, and (c) the server device functions as an estimator that, based on a second execution, subsequent to the first execution, of the collection program being started, estimates whether any header records located between a beginning of the file and the non-transitorily stored offset are updated, and a starter that, in the second execution of the collection program, based on an estimation that any of the header records are updated, causes the collector to start reading the record of the file from the beginning of the file, and based on an estimation that none of the header records are updated, causes the collector to start reading the record of the file from the non-transitorily stored offset and to skip reading the record from the beginning of the file to the non-transitorily stored offset.
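The flow recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: records are assumed to be newline-terminated, a Python list stands in for the collection system, and the estimator is passed in as a function. The names `collect`, `run_collection`, and `headers_updated` are hypothetical.

```python
import io
from typing import BinaryIO, Callable, List, Tuple

Collected = List[Tuple[int, bytes]]  # (position of record start, record)


def collect(f: BinaryIO, start: int, sink: Collected) -> int:
    """Read complete records from `start` in file order; return the new offset."""
    f.seek(start)
    offset = start
    while True:
        record = f.readline()
        if not record.endswith(b"\n"):
            break  # end of file, or a record still being written (assumption)
        sink.append((offset, record))  # associate the record with its start position
        offset += len(record)          # offset now points at the end of that record
    return offset


def run_collection(f: BinaryIO, stored_offset: int,
                   headers_updated: Callable[[BinaryIO, int], bool],
                   sink: Collected) -> int:
    """One execution of the collection program: choose a start point, then collect."""
    start = 0 if headers_updated(f, stored_offset) else stored_offset
    return collect(f, start, sink)


log = io.BytesIO(b"a\nbb\n")
sink: Collected = []
offset = run_collection(log, 0, lambda f, o: False, sink)       # first execution
log.seek(0, io.SEEK_END)
log.write(b"ccc\n")                                             # adding process
offset = run_collection(log, offset, lambda f, o: False, sink)  # second execution
```

In the second execution only the appended record is read, because the stand-in estimator reports that no header record changed. An estimator that returns `True` would make the same call re-read the file from position 0.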
Claim: 2. The file management system according to claim 1, wherein the collector acquires an extraction position between the beginning of the file and the non-transitorily stored offset, reads data located at the extraction position acquired from the file, calculates a hash value of the read data, and non-transitorily stores the acquired extraction position and the calculated hash value, and the starter acquires the non-transitorily stored extraction position, reads the data located at the extraction position acquired from the file, and calculates the hash value of the read data and, based on the calculated hash value being equivalent to the non-transitorily stored hash value, the estimator estimates that none of the header records are updated.
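The hash comparison in claim 2 might look like the sketch below. The claim does not name a hash function or say how much data is read at the extraction position, so SHA-256 and the 16-byte `SAMPLE_LEN` are assumptions, as are the function names.

```python
import hashlib
import io
from typing import BinaryIO

SAMPLE_LEN = 16  # hypothetical sample size in bytes


def sample_hash(f: BinaryIO, position: int, offset: int) -> str:
    """Hash up to SAMPLE_LEN bytes at `position`, never reading past `offset`."""
    f.seek(position)
    data = f.read(min(SAMPLE_LEN, offset - position))
    return hashlib.sha256(data).hexdigest()


def none_updated(f: BinaryIO, position: int, offset: int, stored_hash: str) -> bool:
    """Estimate that no header record changed if the sample still hashes the same."""
    return sample_hash(f, position, offset) == stored_hash
```

Clamping the read at the stored offset keeps later appends from changing the sample. The result is only an estimate: an edit that falls outside the sampled bytes is not detected by this check.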
Claim: 3. The file management system according to claim 2, wherein the collector randomly determines the extraction position between the beginning of the file and the non-transitorily stored offset.
Claim: 4. The file management system according to claim 3, wherein the extraction position is randomly determined using random numbers having a probability distribution that attenuates from one of the non-transitorily stored offset and the beginning of the file to the other of the non-transitorily stored offset and the beginning of the file.
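One way to draw a position from an attenuating distribution, as in claims 3 and 4, is a truncated exponential. The claims do not specify the distribution, so the exponential shape, the `scale` parameter, and the function name are assumptions made for illustration.

```python
import random


def pick_extraction_position(offset: int, scale: float, rng: random.Random,
                             near_offset: bool = True) -> int:
    """Draw a position in [0, offset) whose density decays away from one end."""
    if offset <= 0:
        raise ValueError("offset must be positive")
    while True:
        distance = int(rng.expovariate(1.0 / scale))  # mean distance is `scale`
        if distance < offset:                         # reject draws outside the range
            break
    # Measure the distance from the offset end or from the beginning of the file.
    return offset - 1 - distance if near_offset else distance
```

With `near_offset=True` most samples land just before the stored offset; with `near_offset=False` they cluster at the beginning of the file.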
Claim: 5. The file management system according to claim 1, wherein the collector acquires an extraction position uniquely associated with the non-transitorily stored offset, reads data located at the extraction position acquired from the file, calculates a hash value of the read data, and non-transitorily stores the calculated hash value, and the starter acquires the extraction position uniquely associated with the non-transitorily stored offset, reads the data located at the extraction position acquired from the file, and calculates a hash value of the read data and, based on the calculated hash value being equivalent to the non-transitorily stored hash value, the estimator estimates that none of the header records are updated.
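In claim 5 the extraction position is a function of the stored offset, so only the hash value has to be stored. The claim does not say how the position is derived; the sketch below assumes one possible mapping (a digest of the offset reduced modulo the offset), and `derived_position` is a hypothetical name.

```python
import hashlib


def derived_position(offset: int) -> int:
    """Map an offset to one fixed position in [0, offset)."""
    if offset <= 0:
        raise ValueError("offset must be positive")
    digest = hashlib.sha256(str(offset).encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % offset
```

Because the mapping is deterministic, the collector and the starter arrive at the same position from the same stored offset without exchanging it.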
Claim: 6. The file management system according to claim 1, wherein when a first record located at the beginning of the file is read by the collector and the first record is collected by the collection system, the collector calculates a hash value of the first record and non-transitorily stores the calculated hash value, and the starter calculates a hash value of the first record located at the beginning of the file and, in cases in which the calculated hash value is equivalent to the non-transitorily stored hash value, estimates that none of the header records are updated.
Claim: 7. The file management system according to claim 1, wherein an estimation by the estimator is performed periodically or intermittently after the first execution of the collection program is started, and in cases in which, as a result of the periodically or the intermittently performed estimation, it is estimated that any of the header records are updated, the collector reads the record of the file again from the beginning of the file.
Claim: 8. The file management system according to claim 1, wherein the file is restored by concatenating, in an order of the positions associated with the collected records, the records collected by the collection system.
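The restoration in claim 8 amounts to sorting the collected records by position and joining them. A minimal sketch, assuming the collected data is available as `(position, record)` pairs with one record per position; the function name `restore` is hypothetical.

```python
from typing import Iterable, Tuple


def restore(collected: Iterable[Tuple[int, bytes]]) -> bytes:
    """Concatenate collected records in order of their associated positions."""
    ordered = sorted(collected, key=lambda pair: pair[0])
    return b"".join(record for _, record in ordered)
```

The order in which records arrived at the collection system does not matter, since the positions alone fix the order of concatenation.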
Claim: 9. The file management system according to claim 1, wherein the collection system includes a plurality of collection devices, each of the records collected by the collection system is stored in one of the plurality of collection devices, together with a position associated with each of the records, and when the collection system receives a query from a client terminal, each of the plurality of collection devices extracts a record satisfying the query from among the records stored therein, and responds to the client terminal with the extracted record.
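The arrangement in claim 9 can be modelled in memory as below. The claim does not say how a record is assigned to a device or how the responses reach the client, so placing records by `position % n` and merging the per-device results into one sorted list are assumptions, as are the class and method names.

```python
from typing import Callable, List, Tuple

Stored = Tuple[int, bytes]  # (position, record)


class CollectionDevice:
    """Holds a share of the collected records together with their positions."""

    def __init__(self) -> None:
        self.records: List[Stored] = []

    def store(self, position: int, record: bytes) -> None:
        self.records.append((position, record))

    def query(self, predicate: Callable[[bytes], bool]) -> List[Stored]:
        # Each device searches only the records it stores itself.
        return [(p, r) for p, r in self.records if predicate(r)]


class CollectionSystem:
    """Spreads records over several devices and fans queries out to all of them."""

    def __init__(self, device_count: int) -> None:
        self.devices = [CollectionDevice() for _ in range(device_count)]

    def collect(self, position: int, record: bytes) -> None:
        # Hypothetical placement rule: choose the device by position.
        self.devices[position % len(self.devices)].store(position, record)

    def query(self, predicate: Callable[[bytes], bool]) -> List[Stored]:
        hits: List[Stored] = []
        for device in self.devices:
            hits.extend(device.query(predicate))
        return sorted(hits)  # merged here only for convenience
```

A client asking for records that start with `b"GET"` would call `system.query(lambda r: r.startswith(b"GET"))` and receive the matching records from every device.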
Claim: 10. A file management method executed by a server device that non-transitorily stores a file and an offset for the file, and a collection system, the method comprising: (a) the server device executing an editing program, thereby updating the file by an editing process that includes an adding process that adds a record to an end of the file; (b) the server device executing a collection program, thereby, in a first execution of the collection program, reading, in order of location in the file, a record included in the file, causing the collection system to associate and collect the read record and a position where a beginning of the read record is located in the file, and updating the non-transitorily stored offset to a position where an end of the collected record is located in the file; and (c) based on a second execution, subsequent to the first execution, of the collection program being started, the server device estimating whether any header records located between a beginning of the file and the non-transitorily stored offset are updated and, in the second execution of the collection program, based on an estimation that any of the header records are updated, starting reading the record of the file from the beginning of the file, and based on an estimation that none of the header records are updated, starting reading the record of the file from the non-transitorily stored offset and skipping reading the record from the beginning of the file to the non-transitorily stored offset.
Claim: 11. A non-transitory computer-readable information recording medium on which a collection program is stored, the collection program being executable by a server device in a file management system, the file management system comprising the server device that non-transitorily stores a file and an offset for the file, and a collection system, wherein the collection program executed by the server device causes the server device to: function as a collector that, in a first execution of the collection program, reads, in order of location in the file, a record included in the file, causes the collection system to associate and collect the read record and a position where a beginning of the read record is located in the file, and updates the non-transitorily stored offset to a position where an end of the collected record is located in the file; function as an estimator that, based on a second execution, subsequent to the first execution, of the collection program being started, estimates whether any header records located between a beginning of the file and the non-transitorily stored offset are updated; and function as a starter that, in the second execution of the collection program, based on an estimation that any of the header records are updated, causes the collector to start reading the record of the file from the beginning of the file, and based on an estimation that none of the header records are updated, causes the collector to start reading the record of the file from the non-transitorily stored offset and to skip reading the record from the beginning of the file to the non-transitorily stored offset.
Patent References Cited: 5720026 February 1998 Uemura
5926821 July 1999 Hirose
6009502 December 1999 Boeuf
6339795 January 2002 Narurkar
6604236 August 2003 Draper
7277905 October 2007 Randal
7890469 February 2011 Maionchi
8032009 October 2011 Furuta
8135676 March 2012 Poojary
8601225 December 2013 Atluri
8666944 March 2014 Beatty
8694458 April 2014 Kim
9020987 April 2015 Nanda
9171002 October 2015 Mam
9201906 December 2015 Kumarasamy
9251186 February 2016 Muller
9286165 March 2016 Hwang
9311333 April 2016 Pawar
9442955 September 2016 Pawar
9449007 September 2016 Wood
9460177 October 2016 Pawar
9633022 April 2017 Vijayan
9785510 October 2017 Madhavarapu
10250446 April 2019 Prasad
10380141 August 2019 Chepel
10514985 December 2019 Patwardhan
10567500 February 2020 Leshinsky
10776209 September 2020 Pawar
2008/0215834 September 2008 Dumitru
2012/0246125 September 2012 Kato
2016/0210194 July 2016 Kumarasamy
2018/0165345 June 2018 Nomura
2014-81898 May 2014
Assistant Examiner: Rayyan, Susan F
Primary Examiner: Beausoliel, Jr., Robert W
Attorney, Agent or Firm: Sughrue Mion, PLLC
Document Code: edspgr.11169962
Database: USPTO Patent Grants