brief: | i18n.site now supports serverless full-text search.

This article introduces the implementation of pure front-end full-text search, including an inverted index built on IndexedDB, prefix search, word segmentation optimization and multi-language support.

Compared with existing solutions, i18n.site's pure front-end full-text search is small and fast, suits small and medium-sized websites such as documentation sites and blogs, and is available offline.


Pure Front-End Inverted Full-Text Search

Preface

After several weeks of development, i18n.site, which builds websites from markdown, now supports pure front-end full-text search.

This article shares the technical implementation of i18n.site's pure front-end full-text search. Visit i18n.site to try the search.

Source code: search kernel / interactive interface

Overview of Serverless Full-Text Search Solutions

For small and medium-sized purely static websites such as documentation sites and personal blogs, running a self-built full-text search backend is too heavy, so serverless full-text search is the more common choice.

Serverless full-text search solutions fall into two broad categories:

The first is third-party search services such as algolia.com.

Such services charge by search volume, and are often unavailable to users in mainland China because of issues such as website compliance.

They cannot be used offline or on an intranet, which limits them considerably. This article does not discuss them further.

The second is pure front-end full-text search.

At present, common pure front-end full-text search libraries include lunrjs and ElasticLunr.js (a secondary development based on lunrjs ).

lunrjs offers two ways to build an index, and each has its own problems.

  1. Pre-built index files

    Because the index contains the words from all documents, it is large. Whenever a document is added or modified, a new index file must be loaded. This increases the user's waiting time and consumes a lot of bandwidth.

  2. Load documents and build the index on the fly

    Building an index is computationally intensive. Rebuilding it on every visit causes noticeable lag and a poor user experience.


Besides lunrjs , there are other full-text search solutions, such as:

fusejs , which searches by computing the similarity between strings.

This approach performs very poorly and cannot be used for full-text search (see "Fuse.js: Long query takes more than 10 seconds, how to optimize it?").

TinySearch , which searches with a Bloom filter, so it cannot do prefix search (for example, typing goo to find good and google ) and cannot provide autocomplete-like behavior.

Because of these shortcomings in existing solutions, i18n.site developed a new pure front-end full-text search solution with the following characteristics:

  1. Supports multi-language search and is small: the search kernel is 6.9KB after packaging and gzip (for comparison, lunrjs is 25KB ).
  2. Builds an inverted index on indexedb , which uses little memory and is fast.
  3. When documents are added or modified, only those documents are re-indexed, which reduces the amount of computation.
  4. Supports prefix search, so results can be shown in real time while the user types.
  5. Available offline

The technical implementation of i18n.site's search is described in detail below.

Multilingual Word Segmentation

Word segmentation uses the browser's native Intl.Segmenter , an interface supported by all mainstream browsers.

The coffeescript code for word segmentation is as follows

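# Word-granularity segmenter; 0 as the locales argument falls back to the default locale
SEG = new Intl.Segmenter 0, granularity: "word"

# Split text into search terms, dropping whitespace and punctuation
seg = (txt) =>
  r = []
  for {segment} from SEG.segment(txt)
    # Further split each segment on "."
    for i from segment.split('.')
      i = i.trim()
      # Skip empty strings, "|" and "`", and anything containing punctuation
      if i and !'|`'.includes(i) and !/\p{P}/u.test(i)
        r.push i
  r

export default seg

# Segment a search query after lowercasing it
export segqy = (q) =>
  seg q.toLocaleLowerCase()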

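In the code above, /\p{P}/u is a regular expression that matches Unicode punctuation, so segments containing punctuation are dropped, and split('.') further splits each segment on the . character.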

Index Building

Five object stores (tables) are created in IndexedDB ; the ones referred to below are the doc table, the prefix table and the inverted index.

Pass in an array of document url and version number ver pairs, and check whether each document exists in the doc table. If it does not, build its inverted index. At the same time, remove the inverted index entries of documents that were not passed in.

This makes indexing incremental and reduces the amount of computation.
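The incremental step described above can be sketched as a small planning function. This is only an illustration under stated assumptions: planIndexUpdate is a hypothetical name, the stored state is modelled as an in-memory Map instead of the doc table in IndexedDB , and a changed ver is treated the same as a missing document.

```javascript
// Hypothetical helper (not from the i18n.site source): decide what to (re)index.
// stored:   Map of url -> ver, standing in for the doc table
// incoming: array of [url, ver] pairs for the current set of documents
const planIndexUpdate = (stored, incoming) => {
  const incomingUrls = new Set();
  const toIndex = [];
  for (const [url, ver] of incoming) {
    incomingUrls.add(url);
    // Unknown url, or a different version: this document needs indexing
    if (stored.get(url) !== ver) toIndex.push(url);
  }
  // Documents that were not passed in: their index entries should be removed
  const toRemove = [...stored.keys()].filter((url) => !incomingUrls.has(url));
  return { toIndex, toRemove };
};

const examplePlan = planIndexUpdate(
  new Map([["/a", 1], ["/b", 1], ["/c", 1]]),
  [["/a", 1], ["/b", 2], ["/d", 1]]
);
console.log(examplePlan); // { toIndex: [ '/b', '/d' ], toRemove: [ '/c' ] }
```

Unchanged documents such as /a are skipped entirely, which is where the saving comes from. A real implementation would also need to drop the old entries of a re-indexed document such as /b before writing the new ones.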

In the front-end interaction, a progress bar for index loading can be shown to avoid perceived lag on first load (built from a progress element plus pure css ).

IndexedDB High-Concurrency Writes

The project is built on idb , an asynchronous wrapper around IndexedDB .

IndexedDB reads and writes are asynchronous. When the index is built, documents are loaded concurrently and indexed as they arrive.

To avoid losing part of the data to competing writes, you can refer to the coffeescript code below, which adds an ing cache between the read and the write to intercept competing writes.

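# Returns a push function that merges concurrent writes to the same id
pusher = =>
  # id -> Set of values waiting to be written
  ing = new Map()
  (table, id, val)=>
    id_set = ing.get(id)
    if id_set
      # A write for this id is already in flight: queue the value for it
      id_set.add val
      return

    id_set = new Set([val])
    ing.set id, id_set
    pre = await table.get(id)
    li = pre?.li or []

    loop
      # Write everything queued so far, then check whether more arrived meanwhile
      to_add = [...id_set]
      li.push(...to_add)
      await table.put({id,li})
      for i from to_add
        id_set.delete i
      if not id_set.size
        ing.delete id
        break
    return

rindexPush = pusher()
prefixPush = pusher()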

Precision and Recall

The search first segments the keywords entered by the user.

Suppose segmentation produces N words. Results that contain all N keywords are returned first, followed by results that contain N-1 , N-2 ,..., 1 keywords.

The results shown first ensure the precision of the query, and the results loaded afterwards (by clicking the load more button) ensure the recall rate.
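The ordering described above can be illustrated with a short sketch. It assumes an in-memory Map from word to document ids in place of the inverted index stored in IndexedDB , and rankByMatchCount is a hypothetical name used only for this example.

```javascript
// Hypothetical sketch: order documents by how many distinct query words they contain.
// rindex: Map of word -> array of document ids (stand-in for the inverted index)
// words:  the segmented query
const rankByMatchCount = (rindex, words) => {
  const hits = new Map(); // document id -> number of distinct query words matched
  for (const word of new Set(words)) {
    for (const docId of new Set(rindex.get(word) ?? [])) {
      hits.set(docId, (hits.get(docId) ?? 0) + 1);
    }
  }
  // Documents matching all N words come first, then N-1, and so on
  return [...hits.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
};

const exampleIndex = new Map([
  ["full", [1, 2, 3]],
  ["text", [2, 3]],
  ["search", [3]],
]);
console.log(rankByMatchCount(exampleIndex, ["full", "text", "search"])); // [ 3, 2, 1 ]
```

Document 3 contains all three words and is listed first; document 1 contains only one of them and is listed last, so it would only appear after loading more results.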

On-Demand Loading

To improve response speed, the search uses a yield -based generator to load results on demand, returning each time limit results have been found.

Note that each time the search resumes after a yield , the IndexedDB query transaction must be reopened.
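A minimal sketch of the generator pattern described above, using a plain iterable of ids in place of IndexedDB queries. The name inBatches is hypothetical. In the real setting each resumption would need a fresh transaction, because an IndexedDB transaction commits automatically once control returns to the event loop with no pending requests.

```javascript
// Hypothetical sketch: hand out results in pages of `limit`, computing each page lazily.
function* inBatches(ids, limit) {
  let batch = [];
  for (const id of ids) {
    batch.push(id);
    if (batch.length === limit) {
      // Pause here until the caller asks for the next page
      yield batch;
      batch = [];
    }
  }
  // Final partial page, if any
  if (batch.length) yield batch;
}

const pages = inBatches([1, 2, 3, 4, 5], 2);
console.log(pages.next().value); // [ 1, 2 ]  (first screen of results)
console.log(pages.next().value); // [ 3, 4 ]  (after clicking "load more")
```

Nothing beyond the requested page is computed until next() is called again, which keeps the first response fast.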

Real-Time Prefix Search

Search results are shown while the user is typing: for example, when wor is entered, words that start with wor , such as words and work , are matched.

The search kernel looks up the last word produced by segmentation in the prefix table to find all words that start with it, and searches for each of them in turn.
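The prefix lookup above can be sketched as follows. This is an illustration only: the layout of the real prefix table is not shown in this article, so the sketch assumes a Map from every prefix of a word to the set of words that start with it, and both function names are hypothetical.

```javascript
// Hypothetical sketch: map every prefix of each indexed word to the words that start with it.
const buildPrefixTable = (words) => {
  const table = new Map();
  for (const word of words) {
    const chars = [...word]; // iterate by code point, not UTF-16 unit
    for (let n = 1; n <= chars.length; n++) {
      const prefix = chars.slice(0, n).join("");
      if (!table.has(prefix)) table.set(prefix, new Set());
      table.get(prefix).add(word);
    }
  }
  return table;
};

// Expand only the last query word, since it is the one still being typed.
const expandLastWord = (table, queryWords) => {
  if (!queryWords.length) return [];
  const head = queryWords.slice(0, -1);
  const last = queryWords[queryWords.length - 1];
  return [...(table.get(last) ?? [])].map((word) => [...head, word]);
};

const prefixTable = buildPrefixTable(["work", "words", "world", "good"]);
console.log(expandLastWord(prefixTable, ["hello", "wor"]));
// [ [ 'hello', 'work' ], [ 'hello', 'words' ], [ 'hello', 'world' ] ]
```

Each expanded query can then be run through the normal search in sequence. Storing every prefix trades storage for lookup speed; a real implementation may cap the prefix length or store word ids instead of strings.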

The front-end interaction also uses a debounce function (implemented as follows) to reduce how often user input triggers a search, which reduces the amount of computation.

export default (wait, func) => {
  var timeout;
  return function(...args) {
    clearTimeout(timeout);
    timeout = setTimeout(func.bind(this, ...args), wait);
  };
}

Available Offline

The index tables do not store the original text, only words, which reduces the amount of storage.

Highlighting search results requires reloading the original text, and pairing this with a service worker avoids repeated network requests.

At the same time, because the service worker caches all articles, once the user has performed a search, the entire website, including search, is available offline.
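The caching behaviour described above is commonly implemented as a cache-first lookup. The sketch below is not taken from the i18n.site source: cacheFirst is a hypothetical helper that takes a Cache-like object and a fetch function as parameters, so the logic can be read and run outside a service worker.

```javascript
// Hypothetical sketch: answer from the cache when possible, otherwise fetch and store.
// cache:   object with async match(url) and put(url, response), like the Cache API
// fetchFn: function returning a promise of a response, like fetch
const cacheFirst = async (cache, fetchFn, url) => {
  const hit = await cache.match(url);
  if (hit !== undefined) return hit; // served without touching the network
  const res = await fetchFn(url);
  // Store a copy, because a real Response body can only be read once
  await cache.put(url, res.clone());
  return res;
};

// Inside a service worker this could be wired up roughly as follows
// (the cache name "docs" is an assumption):
//   self.addEventListener("fetch", (event) => {
//     event.respondWith(
//       caches.open("docs").then((cache) => cacheFirst(cache, fetch, event.request))
//     );
//   });
```

With this pattern the original text needed for highlighting is fetched once and then read from the cache, which is also what makes the pages readable offline afterwards.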

Display Optimized for MarkDown Documents

i18n.site 's pure front-end search is optimized for MarkDown documents.

When search results are displayed, the chapter name is shown, and clicking a result navigates to that chapter.

Summary

Inverted full-text search implemented purely on the front end needs no server. It is well suited to small and medium-sized websites such as documentation sites and personal blogs.

i18n.site 's open-source, self-developed pure front-end search is small and responsive, addresses the shortcomings of current pure front-end full-text search, and provides a better user experience.