orangearg.argument.miner.processor

Argument processor module.

Module Contents

Functions

_match_list_size(*args)

Given an arbitrary number of lists, check whether they are all the same size.

_aggregate_list_by_another(→ Dict)

Aggregate a list according to elements of another list.

get_argument_topics(→ List[Tuple[int]])

Get argument topics.

get_argument_sentiment(→ List[float])

Get argument sentiment score.

get_argument_coherence(→ List[float])

Get argument coherence.

update_argument_table(→ pandas.DataFrame)

Return a copy of the argument dataframe with new columns for argument topics, sentiments, and coherences.

orangearg.argument.miner.processor._match_list_size(*args: List)

Given an arbitrary number of lists, check whether they are all the same size.
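The check described above can be sketched as follows. This is a minimal illustration, not the module's implementation: the name `lists_have_same_size` is hypothetical, and returning a boolean is an assumption, since this page does not say whether `_match_list_size` returns a value or raises on a mismatch.

```python
from typing import List


def lists_have_same_size(*args: List) -> bool:
    """Hypothetical helper: True when every list passed in has the same length."""
    # The set of lengths has at most one element when all sizes agree.
    # With no arguments the set is empty, which is treated as "all equal" here.
    return len({len(a) for a in args}) <= 1
```

For example, `lists_have_same_size([1, 2], ["a", "b"])` holds, while `lists_have_same_size([1], [1, 2])` does not.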

orangearg.argument.miner.processor._aggregate_list_by_another(keys: List, values: List) → Dict

Aggregate a list according to elements of another list.

Parameters:
  • keys (List) – The group keys.

  • values (List) – The list to be aggregated.

Returns:

The aggregation result.

Return type:

Dict
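One plausible reading of this aggregation is a group-by: values that share a key are collected into one list. The sketch below assumes that reading. The name `aggregate_by_keys` is hypothetical, and the shape of the returned dictionary (key to list of values) is an assumption not confirmed by this page.

```python
from collections import defaultdict
from typing import Dict, List


def aggregate_by_keys(keys: List, values: List) -> Dict:
    """Hypothetical helper: group values by the key at the same position."""
    groups = defaultdict(list)
    # Pair each key with its value and append the value to that key's group.
    for key, value in zip(keys, values):
        groups[key].append(value)
    # Return a plain dict so missing keys raise KeyError instead of
    # silently creating empty groups.
    return dict(groups)
```

Calling `aggregate_by_keys([0, 0, 1], ["a", "b", "c"])` groups `"a"` and `"b"` under key `0` and `"c"` under key `1`.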

orangearg.argument.miner.processor.get_argument_topics(arg_ids: List[int], topics: List[int]) → List[Tuple[int]]

Get argument topics.

The topics of an argument are the combined topics of all chunks that belong to it. Duplicates are kept, because a repeated topic can be treated as a sign of that topic's importance. In addition, two chunks that share a topic can still have different ranks within an argument.

Parameters:
  • arg_ids (List[int]) – the argument ids of chunks.

  • topics (List[int]) – the topic indices of chunks.

Returns:

A list of argument topics, where each entry holds the topic indices of the chunks belonging to that argument.

Return type:

List[list[int]]
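The grouping described above, with duplicates kept, might look like the sketch below. The name `argument_topics_sketch` is hypothetical. Returning tuples follows the annotated signature, and ordering arguments by first appearance of their id is an assumption, since this page does not state the output order.

```python
from typing import Dict, List, Tuple


def argument_topics_sketch(arg_ids: List[int], topics: List[int]) -> List[Tuple[int, ...]]:
    """Hypothetical sketch: collect chunk topics per argument, keeping duplicates."""
    grouped: Dict[int, List[int]] = {}
    for arg_id, topic in zip(arg_ids, topics):
        # Duplicated topics are appended as-is; they are not deduplicated.
        grouped.setdefault(arg_id, []).append(topic)
    # Dicts preserve insertion order, so arguments come out in the order
    # their ids first appear in arg_ids (an assumption of this sketch).
    return [tuple(chunk_topics) for chunk_topics in grouped.values()]
```

With `arg_ids = [0, 0, 1, 1, 1]` and `topics = [2, 2, 0, 3, 0]`, the first argument keeps topic `2` twice and the second keeps topic `0` twice.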

orangearg.argument.miner.processor.get_argument_sentiment(arg_ids: List[int], ranks: List[float], p_scores: List[float], min_sent: int = -1, max_sent: int = 1) → List[float]

Get argument sentiment score.

The sentiment score of an argument is the weighted sum of the sentiment scores of its chunks, with the chunk ranks as weights. The resulting score is then normalized into the range [0, 1].

Parameters:
  • arg_ids (List[int]) – the argument ids of chunks.

  • ranks (List[float]) – the pagerank of chunks within arguments.

  • p_scores (List[float]) – the sentiment polarity scores of chunks.

  • min_sent (int) – minimum of argument sentiment before normalization. Defaults to -1.

  • max_sent (int) – maximum of argument sentiment before normalization. Defaults to 1.

Returns:

List of argument sentiment scores, which are floats in range [0, 1].

Return type:

List[float]
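The weighted sum and normalization described above can be sketched as follows. The name `argument_sentiment_sketch` is hypothetical. The sketch assumes min-max scaling, `(raw - min_sent) / (max_sent - min_sent)`, and assumes that the ranks within one argument sum to 1 so the raw score stays inside `[min_sent, max_sent]`. Neither assumption is confirmed by this page.

```python
from typing import Dict, List


def argument_sentiment_sketch(
    arg_ids: List[int],
    ranks: List[float],
    p_scores: List[float],
    min_sent: int = -1,
    max_sent: int = 1,
) -> List[float]:
    """Hypothetical sketch: rank-weighted chunk sentiment per argument, scaled to [0, 1]."""
    raw: Dict[int, float] = {}
    for arg_id, rank, score in zip(arg_ids, ranks, p_scores):
        # Accumulate rank * polarity for every chunk of the same argument.
        raw[arg_id] = raw.get(arg_id, 0.0) + rank * score
    span = max_sent - min_sent
    # Min-max scaling (assumed) maps [min_sent, max_sent] onto [0, 1].
    return [(value - min_sent) / span for value in raw.values()]
```

For instance, an argument with two equally ranked chunks of polarity 1.0 and -1.0 has a raw score of 0.0, which this scaling maps to 0.5.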

orangearg.argument.miner.processor.get_argument_coherence(scores: List[int], sentiments: List[float], min_score: int = 1, max_score: int = 5, variance: float = 0.2) → List[float]

Get argument coherence.

Coherence is computed as the inverted difference between sentiments and overall scores. Overall scores are first normalized into the same range as argument sentiments, which is [0, 1]. The differences are then computed and passed through a Gaussian kernel, which inverts them and scales them into (0, 1].

Parameters:
  • scores (List[int]) – List of argument overall scores.

  • sentiments (List[float]) – List of argument sentiment scores.

  • min_score (int, optional) – Lower bound of scores. Defaults to 1.

  • max_score (int, optional) – Upper bound of scores. Defaults to 5.

  • variance (float) – variance of the Gaussian kernel. Defaults to 0.2.

Returns:

List of argument coherence scores, in the range (0, 1].

Return type:

List[float]
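The computation described above can be sketched as follows. The name `argument_coherence_sketch` is hypothetical, and the exact kernel form, `exp(-diff² / (2 · variance))`, is an assumption: this page names a Gaussian kernel and its variance but does not give the formula.

```python
import math
from typing import List


def argument_coherence_sketch(
    scores: List[int],
    sentiments: List[float],
    min_score: int = 1,
    max_score: int = 5,
    variance: float = 0.2,
) -> List[float]:
    """Hypothetical sketch: Gaussian kernel over (normalized score - sentiment)."""
    span = max_score - min_score
    coherences = []
    for score, sentiment in zip(scores, sentiments):
        # Bring the overall score into [0, 1], the same range as the sentiment.
        normalized = (score - min_score) / span
        diff = normalized - sentiment
        # Assumed kernel: a zero difference gives 1.0, and larger
        # differences decay toward 0 without reaching it.
        coherences.append(math.exp(-(diff ** 2) / (2 * variance)))
    return coherences
```

Under these assumptions, a top score of 5 paired with a sentiment of 1.0 yields a coherence of 1.0, while a score of 1 paired with the same sentiment yields a value close to 0.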

orangearg.argument.miner.processor.update_argument_table(df_arguments: pandas.DataFrame, topics: List[List[int]], sentiments: List[float], coherences: List[float]) → pandas.DataFrame

Return a copy of the argument dataframe with new columns for argument topics, sentiments, and coherences.

Parameters:
  • df_arguments (pd.DataFrame) – argument dataframe.

  • topics (List[List[int]]) – list of argument topics

  • sentiments (List[float]) – list of argument sentiment scores

  • coherences (List[float]) – list of argument coherence scores

Returns:

A copy of the argument dataframe with added columns for argument topics, sentiments, and coherences.

Return type:

pd.DataFrame
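A copy-and-extend step of this kind can be sketched with pandas as follows. The name `add_argument_columns` and the column names `topics`, `sentiment`, and `coherence` are hypothetical; this page does not state which column names the function uses.

```python
from typing import List

import pandas as pd


def add_argument_columns(
    df_arguments: pd.DataFrame,
    topics: List[List[int]],
    sentiments: List[float],
    coherences: List[float],
) -> pd.DataFrame:
    """Hypothetical sketch: return a copy with three extra columns."""
    # Work on a copy so the caller's dataframe is left untouched.
    df = df_arguments.copy()
    # Wrapping in a Series keeps each topic list as a single cell value.
    df["topics"] = pd.Series(topics, index=df.index)
    df["sentiment"] = pd.Series(sentiments, index=df.index)
    df["coherence"] = pd.Series(coherences, index=df.index)
    return df
```

Each input list is expected to hold one entry per row of `df_arguments`, in row order.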