metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:297400
  - loss:CosineSimilarityLoss
base_model: EuroBERT/EuroBERT-210m
widget:
  - source_sentence: >-
      {"type": "opportunity", "customer_code": "", "opportunity_title":
      "#MakeReal#Data - Expertise GCP à la demande", "opportunity_place": "",
      "opportunity_expertise_area": "-1", "opportunity_tools": "",
      "opportunity_activity_area": "", "opportunity_type": "1",
      "opportunity_description": "", "opportunity_criteria": "",
      "opportunity_extract": 1}
    sentences:
      - >-
        {"type": "candidate", "customer_code": "", "title": "CONTROLEUSE DE
        GESTION SENIOR/RAF", "skills": "", "education": "", "experience": "-1",
        "tools": "", "languages": "", "mobility": "", "expertise_area": "",
        "activity_area": "", "list_diplomes": "2006 - Master 1 Maîtrise de
        Sciences Economiques et de Gestion - Marne La Vallée 77 - not provided",
        "typeOf": "-1", "source": "-1", "informationComments": "", "extract": 1,
        "experiences": "[{'skills': '', 'startMonth': '', 'endDate': '',
        'startYear': '', 'description': \"service Cotation en charge de
        L'analyse de la rentabilité, de la solvabilité et de l'autonomie
        financière des entreprises L'établissement du diagnostic financier\",
        'company': '', 'location': '', 'id': '23447', 'title': 'Assistante -
        BANQUE DE FRANCE - 01/01/1994 - 01/01/1994', 'endMonth': '', 'endYear':
        '', 'startDate': ''}, {'skills': '', 'startMonth': '', 'endDate': '',
        'startYear': '', 'description': '', 'company': '', 'location': '', 'id':
        '23448', 'title': 'SUDAC Air Service Groupe - AIR LIQUIDE - 01/01/2007 -
        01/01/2008', 'endMonth': '', 'endYear': '', 'startDate': ''}, {'skills':
        '', 'startMonth': '', 'endDate': '', 'startYear': '', 'description':
        \"Responsable du contrôle de gestion et comptabilité auxiliaire charge
        de L'analyse et le suivi de la rentabilité de trois sociétés et de leurs
        portefeuilles clients La collecte la consolidation et validation de
        tableau de bords pour la production La consolidation de données
        financières pour le suivi budgétaire (mensuel/annuel L'analyse des
        écarts entre le réalisé et le Budgété 6 Rue du Centre 91 Essonne Tel :
        06-29-46-98-74 Mail : ketty58_9@hotmail.com Permis B + Véhicule
        Téléchargé par TEOLIA (111069) le 06/01/2022 14:10:20 Le suivi du
        process facturation (comptabilité clients et fournisseurs)
        L'établissement des rapprochements avec l'expert-comptable et de la
        clôture La trésorerie et du management de trois (3) personnes\",
        'company': '', 'location': '', 'id': '23449', 'title': 'Kéthia MICHEL -
        01/01/1994 - 01/01/1994', 'endMonth': '', 'endYear': '', 'startDate':
        ''}, {'skills': '', 'startMonth': '', 'endDate': '', 'startYear': '',
        'description': 'logistique de 13 M€ CA/an et effectifs 70 p) - Ivry sur
        Seine', 'company': '', 'location': '', 'id': '23450', 'title': 'AXELIS+
        Société - 01/01/2009 - 01/01/2016', 'endMonth': '', 'endYear': '',
        'startDate': ''}, {'skills': '', 'startMonth': '', 'endDate': '',
        'startYear': '', 'description': 'informatique de 44 M€/an effectifs 650
        p) -St Denis', 'company': '', 'location': '', 'id': '23451', 'title':
        'LINKBYNET Société - 01/01/2016 - 01/01/2017', 'endMonth': '',
        'endYear': '', 'startDate': ''}, {'skills': '', 'startMonth': '',
        'endDate': '', 'startYear': '', 'description': \"conseil) -Paris (08) 1
        an Contrôleuse de gestion IT détachée chez BPCE-IT et en charge de : La
        reprise de leur modèle de facturation L'amélioration de leur modèle de
        facturation L'analyse entre les coûts réels et les coûts Budgétés
        L'analyse des écarts entre le réalisé et le Budgété\", 'company': '',
        'location': '', 'id': '23452', 'title': 'RHAPSODIES Société - 01/01/2017
        - 01/01/2018', 'endMonth': '', 'endYear': '', 'startDate': ''},
        {'skills': '', 'startMonth': '', 'endDate': '', 'startYear': '',
        'description': \"1 an Consultante en contrôle de gestion en charge de :
        La construction du PL Mise en place de tableau de bord du suivi de la
        productivité Mise en place d'indicateur pour le service de facturation
        Construction d'un budget sur 3 ans L'analyse entre le budgété et le
        réalisé Support à l'amélioration des process de facturation et
        comptabilité fournisseurs Support à l'amélioration des enregistrements
        analytiques et comptables\", 'company': '', 'location': '', 'id':
        '23453', 'title': 'CONSULTANTE - EN CONTROLE DE GESTION - 01/01/2018 -
        01/01/2019', 'endMonth': '', 'endYear': '', 'startDate': ''}, {'skills':
        '', 'startMonth': '', 'endDate': '', 'startYear': '', 'description':
        \"en contrôle de gestion détachée chez GRT GAZ en charge de :
        L'évolution des coûts et du suivi budgétaire L'évolution des OPEX/CAPEX
        La mise en place de tableau de bord L'accompagnement des chefs de projet
        et portefolio dans leur suivi de projet et portefeuille La construction
        du budget annuel La construction du reporting trimestriel et annuel\",
        'company': '', 'location': '', 'id': '23454', 'title': 'Consultante -
        FAO CONSULTING (société de conseil) - Levallois Perret - 01/03/2020',
        'endMonth': '', 'endYear': '', 'startDate': ''}]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "", "skills": "CAO,
        Construction, GESTION, IBM CATIA, IBM CATIA Version 5, Marketing
        Management, Microsoft, Microsoft Excel, Microsoft PowerPoint, Microsoft
        Word, Pricing, RAID", "education": "", "experience": "0", "tools": "",
        "languages": "", "mobility": "", "expertise_area": "", "activity_area":
        "commercial", "list_diplomes": "DUT - ET COMPETENCES - not provided -
        1999, DUT - Génie électrique - Université J. Fourier à Grenoble, BAC S -
        Génie Mécanique et Productique - Université J. Fourier à Grenoble -
        1996, DUT - Option technologies industrielles - Lycée Vaucanson à
        Grenoble - 1999, DUT - Génie électrique - Université J. Fourier à
        Grenoble", "typeOf": "-1", "source": "7", "informationComments": "",
        "extract": 1, "experiences": "[{'description': 'Trucks Commercial
        Vehicle\\r\\n', 'title': 'Manager Marketing véhicules Construction -
        Renault'}, {'description': \"- Saint-Priest (69) Responsable de
        l'animation Marketing de la gamme Construction * Réalisation des
        plateformes Marketing intégrant le contenu de l'offre, l'argumentation
        commerciale et l'analyse concurrence * Création d'ateliers de
        présentation des véhicules adaptés aux différents marchés internationaux
        * Organisation d'événements promotionnels et présentations clients et
        journalistes * Analyse trimestrielle des ventes par modèles et
        définition d'actions marketing et pricing * Conception des cahiers des
        charges formations commerciales Manager Marketing gamme lourde - Renault
        Trucks International\\r\\n\", 'title': 'Groupe AB Volvo - De -
        01/01/2012'}, {'description': 'Trucks Commercial Vehicle\\r\\n',
        'title': 'Manager Marketing véhicules Construction - Renault -
        01/02/2000'}, {'description': \"- Saint-Priest (69) Responsable de
        l'animation Marketing de la gamme Construction * Réalisation des
        plateformes Marketing intégrant le contenu de l'offre, l'argumentation
        commerciale et l'analyse concurrence * Création d'ateliers de
        présentation des véhicules adaptés aux différents marchés internationaux
        * Organisation d'événements promotionnels et présentations clients et
        journalistes * Analyse trimestrielle des ventes par modèles et
        définition d'actions marketing et pricing * Conception des cahiers des
        charges formations commerciales Manager Marketing gamme lourde - Renault
        Trucks International\\r\\n\", 'title': 'Groupe AB Volvo - De -
        01/01/2012'}]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "", "skills": "",
        "education": "", "experience": "-1", "tools": "", "languages": "",
        "mobility": "", "expertise_area": "", "activity_area": "",
        "list_diplomes": "", "typeOf": "0", "source": "", "informationComments":
        "", "extract": 1, "experiences": "[]"}
  - source_sentence: >-
      {"type": "opportunity", "customer_code": "", "opportunity_title":
      "Chargé(e) de recrutement - CDD - remplacement VLE", "opportunity_place":
      "", "opportunity_expertise_area": "services", "opportunity_tools": "",
      "opportunity_activity_area": "", "opportunity_type": "5",
      "opportunity_description": "", "opportunity_criteria": "",
      "opportunity_extract": 1}
    sentences:
      - >-
        {"type": "candidate", "customer_code": "", "title": "CHARGEE DE
        RECRUTEMENT", "skills": "spontanée, enthousiaste, souhaite intégrer,
        adaptabilité, optimisme, analytique, travail en équipe\n\npack office
        365, cegid, dpae, sirh, cp, stc", "education": "", "experience": "-1",
        "tools": "", "languages": "anglais", "mobility": "", "expertise_area":
        "", "activity_area": "", "list_diplomes": "2018 - BTS Management des
        unités commerciales - Institution Robin Vienne, 2015 - LICENCE 1
        Droit-Science Politique - Université Lumière Lyon II", "typeOf": "-1",
        "source": "7", "informationComments": "", "extract": 1, "experiences":
        "[{'skills': '', 'startMonth': '', 'endDate': '', 'startYear': '',
        'description': 'identification de pharmacovigilance\\ninformation et
        suivi qualité', 'company': '', 'location': '', 'id': '31909', 'title':
        'Chargée de clientèle pharmaceutique - WEBHELP MEDICA - Lyon - 09/2019 -
        12/2019', 'endMonth': '', 'endYear': '', 'startDate': ''}, {'skills':
        '', 'startMonth': '', 'endDate': '', 'startYear': '', 'description':
        'identification des sollicitations locataire\\nsuivi des procédures
        spécifiques\\ngestion des rendez-vous & transfert d’appel', 'company':
        '', 'location': '', 'id': '31910', 'title': 'Conseillère clientèle
        sociale - LYON METROPOLE HABITAT - Lyon - 01/2020 - 06/2020',
        'endMonth': '', 'endYear': '', 'startDate': ''}, {'skills': '',
        'startMonth': '', 'endDate': '', 'startYear': '', 'description':
        'participation au lancement du cdi apprenant\\ngestion de
        projet\\nentretien visio-conférences\\nrédaction et publication
        d’annonces\\nsourcing\\ntraitement des candidatures\\nrecrutement
        volumique\\ncoaching et conseils candidats\\nfeed back entretiens',
        'company': '', 'location': '', 'id': '31911', 'title': 'Chargée de
        Recrutement - THE ADECCO GROUP - Villeurbanne - 11/2020 - 07/2021',
        'endMonth': '', 'endYear': '', 'startDate': ''}, {'skills': '',
        'startMonth': '', 'endDate': '', 'startYear': '', 'description':
        'evaluation des besoins hebdomadaires\\npublication
        d’offre\\npré-qualification téléphonique\\nanimation de session
        collective\\nentretien individuel\\ngestion des affectations\\nrédaction
        des contrats de travail\\ncréation de dossier administratif\\ngestion
        des visites médicales\\ntraitement des absences\\nsollicitations
        candidats\\nprocédure disciplinaire\\nsuivie des démissions', 'company':
        '', 'location': '', 'id': '31912', 'title': 'Chargée des Ressources
        Humaines - STAR SERVICE - 04/2022 - 11/2022', 'endMonth': '', 'endYear':
        '', 'startDate': ''}, {'skills': '', 'startMonth': '', 'endDate': '',
        'startYear': '', 'description': 'mise à jour du reporting
        rh/intérimaires\\nsaisis des éléments de paie\\nsuivi et évaluation des
        plans de compétences\\nrecrutement interne\\ncréation des dossiers
        rh\\naccueil des nouveaux arrivants /circuit d’intégration\\ngestion de
        la relation école-entreprise', 'company': '', 'location': '', 'id':
        '31913', 'title': 'Assistante RH - EGA CORBAS - 06/2022 - 01/2023',
        'endMonth': '', 'endYear': '', 'startDate': ''}]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "", "skills":
        "ANGLAIS, Automotive, Baan, Back Office, Bts (Comm Sw), Business
        Analysis, Business Analyst, Configure, Customer Relationship Management,
        Data Encryption Standard, Enterprise Requirements Planning, EP,
        Français, HP, IBM AS/400, Italien, Microsoft, Microsoft Visual Basic for
        Applications, Microsoft Windows CE, Movex, SalesForce, SAP, SAP MM
        module, SD, Structured Query Language, Test", "education": "",
        "experience": "-1", "tools": "", "languages": "", "mobility": "",
        "expertise_area": "industrie automobile", "activity_area":
        "informatique", "list_diplomes": "", "typeOf": "-1", "source": "0",
        "informationComments": "", "extract": 1, "experiences": "[]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "Agile Backend
        Developer", "skills": "", "education": "", "experience": "-1", "tools":
        "", "languages": "", "mobility": "", "expertise_area": "",
        "activity_area": "", "list_diplomes": "", "typeOf": "0", "source": "",
        "informationComments": "", "extract": 1, "experiences": "[]"}
  - source_sentence: >-
      {"type": "opportunity", "customer_code": "", "opportunity_title":
      "Développeur fullstack orienté Front", "opportunity_place": "",
      "opportunity_expertise_area": "-1", "opportunity_tools": "",
      "opportunity_activity_area": "", "opportunity_type": "1",
      "opportunity_description": "", "opportunity_criteria": "",
      "opportunity_extract": 1}
    sentences:
      - >-
        {"type": "candidate", "customer_code": "", "title": "Développeur React
        JS React Native Node Js", "skills": "", "education": "Bac5",
        "experience": "1", "tools": "", "languages": "", "mobility": "",
        "expertise_area": "", "activity_area": "", "list_diplomes": "Efrei
        Paris, Diplôme d'ingénieur", "typeOf": "-1", "source": "3",
        "informationComments": "", "extract": 1, "experiences": "[]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "", "skills":
        "Automation, Fabrication, GESTION", "education": "", "experience": "-1",
        "tools": "", "languages": "", "mobility": "", "expertise_area": "",
        "activity_area": "profilautres", "list_diplomes": "2016 - Bac + 2 -
        V.A.E Assistante de Gestion niveau III - not provided, CENTRE D'INTÉRÊT
        - Bénévole aux seins d'associations", "typeOf": "-1", "source": "7",
        "informationComments": "", "extract": 1, "experiences":
        "[{'description': 'du groupe et du siège) - Cuisines AVIVA * * * 2011 :
        Assistante commerciale EUROTHERM Automation : Développement de produits
        et systèmes ARGAL : Distributeur de charcuterie espagnole *\\r\\n',
        'title': 'PARCOURS MAJORITAIREMENT EFFECTUÉ EN INTÉRIM COMMERCIALE /
        ASSISTANTE - Assistante polyvalente et juridique gestion en binôme des
        quinze magasins - 01'}, {'description': \"Gestion / analyse des
        compteurs d'eaux / chauffages 16 ANS D'EXPÉRIENCES *\\r\\n\", 'title':
        'Assistante de gestion - OCEA / Attaché clientèle - APRIL : Assurances
        Carrefour - Suez - 01/01/2009 - 01/01/2009'}, {'description':
        'Fabrication équipements électriques *\\r\\n', 'title': 'Assistante
        commerciale - COMECA SYSTEMES - 01/01/2008 - 01/01/2009'},
        {'description': \"Sté d'assainissement / transport *\\r\\n\", 'title':
        'SANEST Suez - 01/01/2007 - 01/01/2008'}, {'description': \"la Formation
        Automobile ADAPTABILITÉ RAPIDE * 2002 : - ALPHA : Groupement de magasin
        de jardinage DOTÉE D'UN BON * 2006 : Assistante de gestion - B.M.S :
        Spécialiste de chariot élévateur assistante S.A.V RELATIONNEL ET SENS *
        2001 : - SOLYFONTE : Fondeur d'or\\r\\n\", 'title': 'Gpment Nationale -
        GNFA - 01/01/2007 - 01/01/2007'}]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "Cloud Consultant",
        "skills": "", "education": "", "experience": "-1", "tools": "",
        "languages": "", "mobility": "", "expertise_area": "", "activity_area":
        "", "list_diplomes": "", "typeOf": "0", "source": "",
        "informationComments": "", "extract": 1, "experiences": "[]"}
  - source_sentence: >-
      {"type": "opportunity", "customer_code": "", "opportunity_title":
      "#MakeReal #Software #Fullstack #Java #Angular", "opportunity_place": "",
      "opportunity_expertise_area": "Edition de logiciels", "opportunity_tools":
      "", "opportunity_activity_area": "", "opportunity_type": "1",
      "opportunity_description": "Le contexte général est le suivant :\n\n-
      Front-End Angular 13 et évolution vers du Micro-FrontEnd\n- Réécriture du
      back-end en Domaine Driven Design / Micro-service Back-End (Java /
      Spring.boot),\n- Intégration continue (Jenkins, Sonar, Nexus …) et
      déploiement via Docker et Amazon AWS\n- Amazon Web Services (EC2, RDS,
      Polly ….)\n- Oracle, SQLServer, Postgres\n- JUnit / NUnit …, Cucumber, et
      Selenium pour la partie tests d’intégration", "opportunity_criteria": "",
      "opportunity_extract": 1}
    sentences:
      - >-
        {"type": "candidate", "customer_code": "", "title": "Développeur
        fullstack", "skills": "", "education": "", "experience": "-1", "tools":
        "", "languages": "", "mobility": "", "expertise_area": "",
        "activity_area": "", "list_diplomes": "", "typeOf": "0", "source": "",
        "informationComments": "", "extract": 1, "experiences": "[]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "Agile Enterprise
        Architect", "skills": "", "education": "", "experience": "-1", "tools":
        "", "languages": "", "mobility": "", "expertise_area": "",
        "activity_area": "", "list_diplomes": "", "typeOf": "0", "source": "",
        "informationComments": "", "extract": 1, "experiences": "[]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "TECHNICIEN
        (pharma)", "skills": "Bts (Comm Sw), C Programming Language, Data
        Encryption Standard, ELISA, Fabrication, High Performance Liquid
        Chromatography  (HPLC), Oracle, SAP", "education": "Bac3", "experience":
        "10", "tools": "", "languages": "", "mobility": "Lyon",
        "expertise_area": "industrie chimique, industrie pharmaceutique",
        "activity_area": "profiltechnicien", "list_diplomes": "2012 - Licence -
        Bio Analyses et Contrôles - Lycée de la, option physiologie - Université
        de la Doua", "typeOf": "-1", "source": "-1", "informationComments": "",
        "extract": 1, "experiences": "[{'skills': '', 'startMonth': '',
        'endDate': '', 'startYear': '', 'description': \"le service Amiante *
        Observation macroscopique et définition de la filière de traitement de
        l'échantillon * Préparation d'échantillons (broyage, filtration,
        dissolution ) pour déterminer s'il y a présence d'amiante. Technicien
        d'analyse sur CPG et HPLC\", 'company': '', 'location': '', 'id':
        '16953', 'title': 'Technicien de production - Carso - Vénissieux (69) -
        01/06/2015 - 01/12/2015', 'endMonth': '', 'endYear': '', 'startDate':
        ''}, {'skills': '', 'startMonth': '', 'endDate': '', 'startYear': '',
        'description': \"Sanofi Pasteur - Marcy-l'Étoile (69) pour le service
        Manufacturing-Technologies * Prise de contact avec les demandeurs
        d'analyses et réception d'échantillons. * Contrôle physico-chimique des
        excipients et vaccins et écriture de rapports d'analyses * Rédaction du
        Mode Opératoire Normalisé des O2 mètre et CO2 mètre\", 'company': '',
        'location': '', 'id': '16954', 'title': 'Technicien de laboratoire -
        01/02/2016 - 01/12/2016', 'endMonth': '', 'endYear': '', 'startDate':
        ''}, {'skills': '', 'startMonth': '', 'endDate': '', 'startYear': '',
        'description': \"en zone à atmosphère contrôlée de Classe C *
        Réalisation des opérations de production dans le respect des exigences
        réglementaires (BPF / cGMP) * Mise à jour des documents Qualité (Cahier
        de salle, Fiche de suivi d'équipement, SCADA, MES et SAP) * Utilisation
        des appareils de mesure (pH, résistivité, test d'étanchéité,
        conductivité. .) et autoclave\", 'company': '', 'location': '', 'id':
        '16955', 'title': 'Technicien de production - Sanofi Genzyme - Gerland
        (69) - 01/01/2017 - 01/12/2017', 'endMonth': '', 'endYear': '',
        'startDate': ''}]"}
  - source_sentence: >-
      {"type": "opportunity", "customer_code": "", "opportunity_title": ".NET
      Developer", "opportunity_place": "", "opportunity_expertise_area":
      "Autres", "opportunity_tools": "", "opportunity_activity_area": "",
      "opportunity_type": "1", "opportunity_description": ".NET\nReact",
      "opportunity_criteria": "", "opportunity_extract": 1}
    sentences:
      - >-
        {"type": "candidate", "customer_code": "", "title": "Agile Back end 
        Developer", "skills": "", "education": "", "experience": "-1", "tools":
        "", "languages": "", "mobility": "", "expertise_area": "",
        "activity_area": "", "list_diplomes": "", "typeOf": "0", "source": "",
        "informationComments": "", "extract": 1, "experiences": "[]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "Consultant Data",
        "skills": "", "education": "", "experience": "-1", "tools": "",
        "languages": "", "mobility": "mondeeuropefrancerhonealpes",
        "expertise_area": "", "activity_area": "", "list_diplomes": "",
        "typeOf": "-1", "source": "3", "informationComments": "pas à l'écoute",
        "extract": 1, "experiences": "[]"}
      - >-
        {"type": "candidate", "customer_code": "", "title": "INGENIEUR D'ETUDES
        ET DEVELOPPEMENT", "skills": "", "education": "", "experience": "-1",
        "tools": "", "languages": "", "mobility": "", "expertise_area": "",
        "activity_area": "", "list_diplomes": "", "typeOf": "0", "source": "",
        "informationComments": "", "extract": 1, "experiences": "[]"}
datasets:
  - gguichard/matching_RH_train10
  - gguichard/matching_RH_val10
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on EuroBERT/EuroBERT-210m

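This is a sentence-transformers model fine-tuned from EuroBERT/EuroBERT-210m on the matching_rh_train10 dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. In the samples shown in this card, each training pair couples a JSON-serialized job opportunity (sentence1) with a JSON-serialized candidate profile (sentence2).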

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: EuroBERT/EuroBERT-210m
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: matching_rh_train10

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'EuroBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
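
The Pooling module above enables only `pooling_mode_mean_tokens`, so the sentence embedding is the average of the token embeddings, with padded positions left out. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the function name `mean_pool` is hypothetical, and it uses tiny 3-dimensional vectors instead of the 768 dimensions of this model.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    # Keep only the vectors of real (unmasked) tokens.
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("at least one unmasked token is required")
    dim = len(kept[0])
    # Component-wise mean over the kept tokens.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Two real tokens followed by one padding token (mask 0).
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0, 4.0]
```

The padding vector does not influence the result, which is why inputs of very different lengths still produce comparable embeddings.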

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("gguichard/matching-rh-peft3")
# Run inference
sentences = [
    '{"type": "opportunity", "customer_code": "", "opportunity_title": ".NET Developer", "opportunity_place": "", "opportunity_expertise_area": "Autres", "opportunity_tools": "", "opportunity_activity_area": "", "opportunity_type": "1", "opportunity_description": ".NET\\nReact", "opportunity_criteria": "", "opportunity_extract": 1}',
    '{"type": "candidate", "customer_code": "", "title": "Agile Back end  Developer", "skills": "", "education": "", "experience": "-1", "tools": "", "languages": "", "mobility": "", "expertise_area": "", "activity_area": "", "list_diplomes": "", "typeOf": "0", "source": "", "informationComments": "", "extract": 1, "experiences": "[]"}',
    '{"type": "candidate", "customer_code": "", "title": "Consultant Data", "skills": "", "education": "", "experience": "-1", "tools": "", "languages": "", "mobility": "mondeeuropefrancerhonealpes", "expertise_area": "", "activity_area": "", "list_diplomes": "", "typeOf": "-1", "source": "3", "informationComments": "pas à l\'écoute", "extract": 1, "experiences": "[]"}',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8869, 0.1913],
#         [0.8869, 1.0000, 0.2530],
#         [0.1913, 0.2530, 1.0000]])
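
The inputs above are JSON strings, and the first row of the similarity matrix compares the opportunity with each candidate. The sketch below shows one way to serialize a record and to rank candidates from such a row. It is an illustration under assumptions: the helper names `to_model_input` and `rank_candidates` and the candidate labels are hypothetical, the record is shortened to three fields, and using `json.dumps` with default separators is a guess that merely matches the look of the strings in this card. The scores are the ones printed above.

```python
import json


def to_model_input(record: dict) -> str:
    """Serialize a record into a single JSON string, as in the example inputs."""
    # ensure_ascii=False keeps accented characters readable instead of \u escapes.
    return json.dumps(record, ensure_ascii=False)


def rank_candidates(scores, candidate_ids):
    """Return (candidate_id, score) pairs sorted from best to worst match."""
    return sorted(zip(candidate_ids, scores), key=lambda pair: pair[1], reverse=True)


# Shortened record for illustration; real inputs would carry the full field set.
opportunity = {"type": "opportunity", "customer_code": "", "opportunity_title": ".NET Developer"}
print(to_model_input(opportunity))

# Row 0 of the similarity matrix above, without the self-similarity entry.
opportunity_vs_candidates = [0.8869, 0.1913]
ranking = rank_candidates(opportunity_vs_candidates, ["agile_backend_developer", "consultant_data"])
print(ranking)
```

With the scores shown above, the back-end developer profile ranks ahead of the data consultant profile for the .NET opportunity.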

Training Details

Training Dataset

matching_rh_train10

  • Dataset: matching_rh_train10 at 601ef4d
  • Size: 297,400 training samples
  • Columns: label, sentence1, and sentence2
  • Approximate statistics based on the first 1000 samples:

    |         | label | sentence1 | sentence2 |
    |---------|-------|-----------|-----------|
    | type    | float | string | string |
    | details | min: 0.0, mean: 0.81, max: 1.0 | min: 82 tokens, mean: 326.44 tokens, max: 1277 tokens | min: 95 tokens, mean: 1200.82 tokens, max: 6900 tokens |
  • Samples:
    label sentence1 sentence2
    1.0 {"type": "opportunity", "customer_code": "", "opportunity_title": "SIENNA - DEV DOT NET", "opportunity_place": "", "opportunity_expertise_area": "Banque", "opportunity_tools": "", "opportunity_activity_area": "", "opportunity_type": "1", "opportunity_description": "", "opportunity_criteria": "", "opportunity_extract": 1} {"type": "candidate", "customer_code": "", "title": "Consultant Sénior Microsoft .NET", "skills": "", "education": "", "experience": "-1", "tools": "", "languages": "", "mobility": "", "expertise_area": "", "activity_area": "", "list_diplomes": "2007 - Master Management des projets informatiques et systèmes d'information, 2004 - Filière Informatique et Réseaux - ENSICAEN", "typeOf": "1", "source": "1", "informationComments": "", "extract": 1, "experiences": "[{'skills': '', 'startMonth': '6', 'endDate': '', 'startYear': '2004', 'description': 'AUTRES MISSIONS\nA\nIngénieur Conception et développement CALCIA\nAnalyste - Responsable d’applications chez EDF\nIngénieur Conception et Développement chez EDF\nIngénieur Conception et Développement chez BNPPARIBAS', 'company': 'AUTRES MISSIONS', 'location': '', 'id': '2536', 'title': 'Ingénieur Conception et développement', 'endMonth': '11', 'endYear': '2008', 'startDate': ''}, {'skills': '.net, .net 2.0, asp.net, c#, front office, gamaweb...
    1.0 {"type": "opportunity", "customer_code": "", "opportunity_title": "Consultant Mainframe - DGFIP - ONEPOINT", "opportunity_place": "", "opportunity_expertise_area": "Autres", "opportunity_tools": "", "opportunity_activity_area": "", "opportunity_type": "1", "opportunity_description": "", "opportunity_criteria": "", "opportunity_extract": 1} {"type": "candidate", "customer_code": "", "title": "Ingénieur de développement\nPACBASE/COBOL/MAINFRAME\n2 ans et ½ d’expérience", "skills": "", "education": "", "experience": "-1", "tools": "", "languages": "français, anglais", "mobility": "mondeeuropefranceiledefranceparis, mondeeuropefranceiledefranceseineetmarne, mondeeuropefranceiledefranceyvelines, mondeeuropefranceiledefranceessone, mondeeuropefranceiledefrancehautsdeseine92, mondeeuropefranceiledefranceseinesaintdenis, mondeeuropefranceiledefrancevaldemarne, mondeeuropefranceiledefrancevaloise", "expertise_area": "", "activity_area": "", "list_diplomes": "2018 - Formation PACBASE - Banque Populaire Dijon, 2018 - Formation Cobol en alternance appliqué au contexte Descours & Cabaud - Alteca Lyon et Informatique, 2018 - Formation interne VBA EXCEL, 2018 - Formation Mainframe IBM/COBOL et Qualification logiciel - INTI Formation, 2016 - Master international Science de la matière - Université de Rouen", "typeOf": "1", "source": "3",...
    1.0 {"type": "opportunity", "customer_code": "", "opportunity_title": "STIME responsable application adjoint", "opportunity_place": "", "opportunity_expertise_area": "Grande distribution", "opportunity_tools": "", "opportunity_activity_area": "", "opportunity_type": "1", "opportunity_description": "", "opportunity_criteria": "", "opportunity_extract": 1} {"type": "candidate", "customer_code": "", "title": "Consultant AMOA- Chef de projet SI", "skills": "", "education": "", "experience": "-1", "tools": "", "languages": "anglais, espagnol", "mobility": "", "expertise_area": "", "activity_area": "", "list_diplomes": "2020 - CERTYOU Paris, 2019 - Certification SCRUM Master - Actinuum Paris, 2017 - Cycle Project Management Professional V5 PMP, 2015 - Urbanisation et architecture SI, 2014 - ITIL Fondation", "typeOf": "1", "source": "1", "informationComments": "", "extract": 1, "experiences": "[{'skills': 'crm, oracle parties, mep, dba, infrastructure, crm people soft, uml, power amc, sql query, oracle, hp quality', 'startMonth': '4', 'endDate': '', 'startYear': '2007', 'description': 'INWI\nà\nSynthèse :\nParticipation à la mise en place du CRM pepoleSoft Oracle parties : vue 360°\nclient , facture et réclamations.\nRôle :\nConsultant AMOA homologation\nRéalisation :\n\uf0b7\nCollecte de besoin métier.\n\uf0b7\nRédaction de spéc...
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
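
With `MSELoss` as `loss_fct`, CosineSimilarityLoss penalizes the squared difference between the cosine similarity of the two sentence embeddings and the float label. The following pure-Python sketch works through that arithmetic on toy 2-dimensional vectors. It is an illustration, not the library code: the function names `cosine_similarity` and `cosine_mse_loss` are hypothetical, and zero vectors are not handled.

```python
import math


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_mse_loss(pairs, labels) -> float:
    """Mean squared error between cos(u, v) and the label, averaged over the batch."""
    errors = [(cosine_similarity(u, v) - y) ** 2 for (u, v), y in zip(pairs, labels)]
    return sum(errors) / len(errors)


batch = [
    ([1.0, 0.0], [1.0, 0.0]),  # same direction: cosine = 1
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal: cosine = 0
]
labels = [1.0, 1.0]  # both pairs are labeled as matches
loss = cosine_mse_loss(batch, labels)
print(loss)  # 0.5
```

The first pair contributes no error, while the second contributes (0 - 1)² = 1, so the batch mean is 0.5. Minimizing this quantity pulls embeddings of matching pairs toward the same direction.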
    

Evaluation Dataset

matching_rh_val10

  • Dataset: matching_rh_val10 at 16fd0da
  • Size: 17,380 evaluation samples
  • Columns: label, sentence1, and sentence2
  • Approximate statistics based on the first 1000 samples:

    |         | label | sentence1 | sentence2 |
    |---------|-------|-----------|-----------|
    | type    | float | string | string |
    | details | min: 0.0, mean: 0.84, max: 1.0 | min: 80 tokens, mean: 352.97 tokens, max: 3661 tokens | min: 90 tokens, mean: 615.01 tokens, max: 6579 tokens |
  • Samples:
    label sentence1 sentence2
    1.0 {"type": "opportunity", "customer_code": "", "opportunity_title": "DATA MANAGER - La POSTE", "opportunity_place": "", "opportunity_expertise_area": "Services", "opportunity_tools": "", "opportunity_activity_area": "", "opportunity_type": "1", "opportunity_description": "", "opportunity_criteria": "", "opportunity_extract": 1} {"type": "candidate", "customer_code": "", "title": "Senior Consultant/Project Manager - Data Management", "skills": "", "education": "", "experience": "-1", "tools": "", "languages": "", "mobility": "", "expertise_area": "", "activity_area": "", "list_diplomes": "BACHELOR - Mathématiques Appliquées - stratégique Université Paris I Panthéon Sorbonne, DEUG - Option Statistique - stratégique Université Paris I Panthéon Sorbonne", "typeOf": "-1", "source": "1", "informationComments": "adresse perso consultant : 99 rue Alfred DININ 92000 Nanterre", "extract": 1, "experiences": "[{'skills': '', 'startMonth': '', 'endDate': '', 'startYear': '', 'description': "Avril ❖Mission : * Automatisation et fiabilisation des calculs de l'inventaire de réassurance sur les produits de prévoyance individuelle commercialisés par les partenaires d'Axa France (SAS/SQL) * Etude de l'efficience et de la rentabilité des traités de réassurance mis en place pour sécuriser le portefeuille de ces produits (SAS/C++...
    1.0 {"type": "opportunity", "customer_code": "", "opportunity_title": "BABILOU - Responsable infra", "opportunity_place": "", "opportunity_expertise_area": "Autres", "opportunity_tools": "", "opportunity_activity_area": "", "opportunity_type": "1", "opportunity_description": "", "opportunity_criteria": "", "opportunity_extract": 1} {"type": "candidate", "customer_code": "", "title": "CHEF DE PROJET INFRASTRUCTURE", "skills": "", "education": "", "experience": "-1", "tools": "", "languages": "", "mobility": "", "expertise_area": "", "activity_area": "", "list_diplomes": "2020 - Microsoft Azure Artificial Intelligence - Microsoft Azure Fundamentals, 2014 - DEA - Probabilités et Applications - Université, 2003 - Diplôme d'ingénieur - Télécoms ENST ParisTech, 2003 - DEA - Signal et Communications Numériques - Université de Nice Sophia-Antipolis", "typeOf": "-1", "source": "1", "informationComments": "", "extract": 1, "experiences": "[{'skills': '', 'startMonth': '', 'endDate': '', 'startYear': '', 'description': '23 mois Études, architecture, ingénierie et paramétrage des réseaux de signalisation et de transit', 'company': '', 'location': '', 'id': '1947', 'title': 'Ingénieur accès fixe et mobile - Contexte - 01/10/2005 - 01/08/2007', 'endMonth': '', 'endYear': '', 'startDate': ''}, {'skills': '', 'startMonth': '', '...
    1.0 {"type": "opportunity", "customer_code": "", "opportunity_title": "DGFIP - ONEPOINT - Consultant JCL", "opportunity_place": "", "opportunity_expertise_area": "Autres", "opportunity_tools": "", "opportunity_activity_area": "", "opportunity_type": "1", "opportunity_description": "", "opportunity_criteria": "", "opportunity_extract": 1} {"type": "candidate", "customer_code": "", "title": "analyste developpeur pacbase cobol db2", "skills": "cobol, pacbase, db2, cics", "education": "", "experience": "-1", "tools": "", "languages": "", "mobility": "mondeeuropefranceiledefranceparis, mondeeuropefranceiledefranceseineetmarne, mondeeuropefranceiledefranceyvelines, mondeeuropefranceiledefranceessone, mondeeuropefranceiledefrancehautsdeseine92, mondeeuropefranceiledefranceseinesaintdenis, mondeeuropefranceiledefrancevaldemarne, mondeeuropefranceiledefrancevaloise", "expertise_area": "", "activity_area": "", "list_diplomes": "", "typeOf": "0", "source": "", "informationComments": "Sabrina Kadrie\n06 83 65 01 64\nsabrina20@orange.fr", "extract": 1, "experiences": "[]"}
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
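
The `sentence1` and `sentence2` samples above look like JSON-serialized records. The card does not describe the preprocessing, so the sketch below is only an assumption: it shows how a record could be turned into such a string with `json.dumps`. The field names and values are copied from the first evaluation sample; `serialize_record` is a hypothetical helper.

```python
import json


def serialize_record(record):
    """Serialize a record to a JSON string.

    ensure_ascii=False keeps accented characters readable instead of
    escaping them, which matches how the samples are displayed.
    """
    return json.dumps(record, ensure_ascii=False)


# Fields copied from the first evaluation sample shown above.
opportunity = {
    "type": "opportunity",
    "customer_code": "",
    "opportunity_title": "DATA MANAGER - La POSTE",
    "opportunity_place": "",
    "opportunity_expertise_area": "Services",
    "opportunity_tools": "",
    "opportunity_activity_area": "",
    "opportunity_type": "1",
    "opportunity_description": "",
    "opportunity_criteria": "",
    "opportunity_extract": 1,
}

sentence1 = serialize_record(opportunity)
print(sentence1)
```

Keeping the key order and separators consistent between training and inference inputs is likely to matter, since the model only ever sees the serialized text.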
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • log_level: error
  • log_level_replica: passive
  • log_on_each_node: False
  • logging_nan_inf_filter: False
  • bf16: True
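
As a sketch of how these overrides might be passed to the trainer, the dictionary below collects the non-default values listed above. It assumes they are forwarded to `SentenceTransformerTrainingArguments`; the `output_dir` argument and the `build_training_args` helper are hypothetical, and the original training script is not part of this card.

```python
# Values copied from the "Non-Default Hyperparameters" list above.
NON_DEFAULT_HYPERPARAMETERS = {
    "eval_strategy": "steps",
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "learning_rate": 2e-05,
    "num_train_epochs": 1,
    "warmup_ratio": 0.1,
    "log_level": "error",
    "log_level_replica": "passive",
    "log_on_each_node": False,
    "logging_nan_inf_filter": False,
    "bf16": True,
}


def build_training_args(output_dir):
    """Build training arguments from the overrides above.

    The import is done lazily so the dictionary can be inspected
    without sentence-transformers installed. `output_dir` is a
    placeholder chosen by the caller, not a value from this card.
    """
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(
        output_dir=output_dir,
        **NON_DEFAULT_HYPERPARAMETERS,
    )
```

Note that `bf16: True` requires hardware with bfloat16 support, so constructing the arguments may fail on machines without it.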

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: error
  • log_level_replica: passive
  • log_on_each_node: False
  • logging_nan_inf_filter: False
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
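
With `lr_scheduler_type: linear`, `warmup_ratio: 0.1` and `warmup_steps: 0`, the learning rate is expected to ramp up linearly from zero to `2e-05` over the first tenth of training and then decay linearly back to zero. The sketch below reproduces that shape in plain Python. The total of 74,350 steps is an assumption (297,400 training samples divided by a batch size of 4 on a single device, with `gradient_accumulation_steps: 1`), which works out to roughly 7,435 warmup steps. `linear_schedule_lr` is a hypothetical helper, not a library function.

```python
import math

LEARNING_RATE = 2e-05
WARMUP_RATIO = 0.1
# Assumption: 297,400 samples / batch size 4, single device, 1 epoch.
TOTAL_STEPS = 74350


def linear_schedule_lr(step, total_steps=TOTAL_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Warmup phase: scale up proportionally to the step count.
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Decay phase: scale down proportionally to the steps remaining.
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - warmup_steps)


for step in (0, 3000, 7436, 40000, TOTAL_STEPS):
    print(step, linear_schedule_lr(step))
```

Under these assumptions, the first evaluation checkpoints in the training logs (steps 3000 and 6000) still fall inside the warmup phase.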

Training Logs

Epoch Step Training Loss Validation Loss
0.0067 500 0.2078 -
0.0134 1000 0.1805 -
0.0202 1500 0.1644 -
0.0269 2000 0.1455 -
0.0336 2500 0.1326 -
0.0403 3000 0.132 0.1514
0.0471 3500 0.1292 -
0.0538 4000 0.1199 -
0.0605 4500 0.1223 -
0.0672 5000 0.1219 -
0.0740 5500 0.1116 -
0.0807 6000 0.1149 0.1483
0.0874 6500 0.1149 -
0.0941 7000 0.1243 -
0.1009 7500 0.1204 -
0.1076 8000 0.1116 -
0.1143 8500 0.109 -
0.1210 9000 0.111 0.1289
0.1278 9500 0.1168 -
0.1345 10000 0.1121 -
0.1412 10500 0.1054 -
0.1479 11000 0.1031 -
0.1547 11500 0.0994 -
0.1614 12000 0.0968 0.1204
0.1681 12500 0.0932 -
0.1748 13000 0.0978 -
0.1816 13500 0.0996 -
0.1883 14000 0.0974 -
0.1950 14500 0.095 -
0.2017 15000 0.0926 0.1139
0.2085 15500 0.0928 -
0.2152 16000 0.1007 -
0.2219 16500 0.0933 -
0.2286 17000 0.0903 -
0.2354 17500 0.0912 -
0.2421 18000 0.0927 0.1124
0.2488 18500 0.0927 -
0.2555 19000 0.1001 -
0.2623 19500 0.0951 -
0.2690 20000 0.0893 -
0.2757 20500 0.0874 -
0.2824 21000 0.0854 0.1100
0.2892 21500 0.0905 -
0.2959 22000 0.0858 -
0.3026 22500 0.0906 -
0.3093 23000 0.0899 -
0.3161 23500 0.0861 -
0.3228 24000 0.0934 0.1063
0.3295 24500 0.0995 -
0.3362 25000 0.0905 -
0.3430 25500 0.0875 -
0.3497 26000 0.074 -
0.3564 26500 0.0875 -
0.3631 27000 0.0821 0.1043
0.3699 27500 0.0877 -
0.3766 28000 0.0837 -
0.3833 28500 0.0854 -
0.3900 29000 0.0754 -
0.3968 29500 0.0803 -
0.4035 30000 0.0872 0.1029
0.4102 30500 0.0829 -
0.4169 31000 0.0841 -
0.4237 31500 0.0861 -
0.4304 32000 0.0827 -
0.4371 32500 0.0867 -
0.4438 33000 0.0808 0.1028
0.4506 33500 0.081 -
0.4573 34000 0.0789 -
0.4640 34500 0.0774 -
0.4707 35000 0.084 -
0.4775 35500 0.0866 -
0.4842 36000 0.0839 0.1010
0.4909 36500 0.0849 -
0.4976 37000 0.0834 -
0.5044 37500 0.0832 -
0.5111 38000 0.0739 -
0.5178 38500 0.077 -
0.5245 39000 0.0799 0.1016
0.5313 39500 0.0775 -
0.5380 40000 0.0788 -
0.5447 40500 0.0821 -
0.5514 41000 0.0796 -
0.5582 41500 0.0795 -
0.5649 42000 0.0836 0.0976
0.5716 42500 0.0783 -
0.5783 43000 0.082 -
0.5851 43500 0.0788 -
0.5918 44000 0.0849 -
0.5985 44500 0.0754 -
0.6052 45000 0.0764 0.0989
0.6120 45500 0.0736 -
0.6187 46000 0.0805 -
0.6254 46500 0.0788 -
0.6321 47000 0.0724 -
0.6389 47500 0.0833 -
0.6456 48000 0.0752 0.0972
0.6523 48500 0.0733 -
0.6590 49000 0.0686 -
0.6658 49500 0.0802 -
0.6725 50000 0.0817 -
0.6792 50500 0.0772 -
0.6859 51000 0.0746 0.0958
0.6927 51500 0.0742 -
0.6994 52000 0.0732 -
0.7061 52500 0.0711 -
0.7128 53000 0.0773 -
0.7196 53500 0.0782 -
0.7263 54000 0.0774 0.0953
0.7330 54500 0.0788 -
0.7397 55000 0.0667 -
0.7465 55500 0.0721 -
0.7532 56000 0.074 -
0.7599 56500 0.0698 -
0.7666 57000 0.0703 0.0948
0.7734 57500 0.0718 -
0.7801 58000 0.0764 -
0.7868 58500 0.078 -
0.7935 59000 0.0784 -
0.8003 59500 0.0771 -
0.8070 60000 0.0766 0.0937
0.8137 60500 0.0758 -
0.8204 61000 0.0747 -
0.8272 61500 0.0814 -
0.8339 62000 0.0719 -
0.8406 62500 0.067 -
0.8473 63000 0.0717 0.0937
0.8541 63500 0.0732 -
0.8608 64000 0.0755 -
0.8675 64500 0.0749 -
0.8742 65000 0.072 -
0.8810 65500 0.071 -
0.8877 66000 0.0702 0.0923
0.8944 66500 0.0676 -
0.9011 67000 0.0753 -
0.9079 67500 0.0734 -
0.9146 68000 0.0654 -
0.9213 68500 0.073 -
0.9280 69000 0.0703 0.0922
0.9348 69500 0.07 -
0.9415 70000 0.0716 -
0.9482 70500 0.0811 -
0.9549 71000 0.0722 -
0.9617 71500 0.0697 -
0.9684 72000 0.0746 0.0915
0.9751 72500 0.0768 -
0.9818 73000 0.0691 -
0.9886 73500 0.0718 -
0.9953 74000 0.0707 -
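
To read the table programmatically, a minimal sketch such as the following can parse the whitespace-separated rows and pick out the evaluation step with the lowest validation loss. Only a handful of rows are copied here for brevity; `parse_log_rows` is a hypothetical helper.

```python
# A few rows copied from the table above:
# epoch, step, training loss, validation loss ("-" means not evaluated).
LOG_EXCERPT = """\
0.0403 3000 0.132 0.1514
0.0471 3500 0.1292 -
0.4842 36000 0.0839 0.1010
0.5245 39000 0.0799 0.1016
0.9684 72000 0.0746 0.0915
0.9953 74000 0.0707 -
"""


def parse_log_rows(text):
    """Parse whitespace-separated log rows into a list of dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        epoch, step, train_loss, val_loss = line.split()
        rows.append({
            "epoch": float(epoch),
            "step": int(step),
            "train_loss": float(train_loss),
            "val_loss": None if val_loss == "-" else float(val_loss),
        })
    return rows


rows = parse_log_rows(LOG_EXCERPT)
evaluated = [row for row in rows if row["val_loss"] is not None]
best = min(evaluated, key=lambda row: row["val_loss"])
print(best["step"], best["val_loss"])  # 72000 0.0915
```

In the full table, the validation loss falls from 0.1514 at step 3000 to 0.0915 at step 72000, with small non-monotonic bumps along the way (for example 0.1010 at step 36000 versus 0.1016 at step 39000).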

Framework Versions

  • Python: 3.10.16
  • Sentence Transformers: 5.1.1
  • Transformers: 4.56.2
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.1.1
  • Tokenizers: 0.22.1
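
To check a local environment against these versions, a small sketch using the standard library's `importlib.metadata` could look like the following. The distribution names are assumed to be the usual PyPI names, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by assumed PyPI names.
EXPECTED_VERSIONS = {
    "sentence-transformers": "5.1.1",
    "transformers": "4.56.2",
    "torch": "2.8.0+cu128",
    "accelerate": "1.10.1",
    "datasets": "4.1.1",
    "tokenizers": "0.22.1",
}


def compare_versions(expected=EXPECTED_VERSIONS):
    """Map each package to (expected, installed, matches).

    `installed` is None when the package is not installed.
    """
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    status = "ok" if matches else "differs"
    print(f"{package}: expected {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the comparison is only a quick way to spot differences from the training environment.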

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}