`orangearg.argument.miner.processor`

Argument processor module.

Module Contents

Functions

`_match_list_size`(*args)	With an arbitrary number of lists as input, check whether they are all the same size.
`_aggregate_list_by_another`(→ Dict)	Aggregate a list according to elements of another list.
`get_argument_topics`(→ List[Tuple[int]])	Get argument topics.
`get_argument_sentiment`(→ List[float])	Get argument sentiment score.
`get_argument_coherence`(→ List[float])	Get argument coherence.
`update_argument_table`(→ pandas.DataFrame)	Return a copy of the argument dataframe with new columns for argument topics, sentiments, and coherences.

orangearg.argument.miner.processor._match_list_size(*args: List)[source]

With an arbitrary number of lists as input, check whether they are all the same size.
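
The size check can be illustrated with a minimal sketch. The name `lists_same_size` and the boolean return value are assumptions made for illustration; this page does not say whether the library function returns a value or raises an error.

```python
from typing import List


def lists_same_size(*args: List) -> bool:
    """Hypothetical helper: True when all input lists have the same length."""
    # Collect the distinct lengths. Zero or one distinct length means
    # there is no mismatch (this also covers the no-argument case).
    return len({len(a) for a in args}) <= 1
```

For example, `lists_same_size([1, 2], ["a", "b"])` is `True`, while `lists_same_size([1], [1, 2])` is `False`.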

orangearg.argument.miner.processor._aggregate_list_by_another(keys: List, values: List) → Dict[source]

Aggregate a list according to elements of another list.

Parameters:

keys (List) – The group keys.
values (List) – The list to be aggregated.

Returns:

The aggregation result.

Return type:

Dict
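
One plausible reading of this aggregation is a group-by over two parallel lists. The sketch below is written under that assumption; the name `aggregate_by_keys`, the length check, and the choice of a list as the per-key container are illustrative and not taken from the library source.

```python
from collections import defaultdict
from typing import Dict, List


def aggregate_by_keys(keys: List, values: List) -> Dict:
    """Hypothetical sketch: group `values` by the matching entry in `keys`."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    groups = defaultdict(list)
    # Walk both lists in parallel and append each value to its key's group.
    for key, value in zip(keys, values):
        groups[key].append(value)
    return dict(groups)
```

With `keys=[0, 0, 1]` and `values=["a", "b", "c"]`, this sketch returns `{0: ["a", "b"], 1: ["c"]}`.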

orangearg.argument.miner.processor.get_argument_topics(arg_ids: List[int], topics: List[int]) → List[Tuple[int]][source]

Get argument topics.

The topics of an argument are the combined topics of all chunks that belong to it. Duplicates are kept deliberately, because a repeated topic can be read as a sign of that topic's importance. In addition, two chunks that share a topic can still have different ranks within an argument.

Parameters:

arg_ids (List[int]) – the argument ids of chunks.
topics (List[int]) – the topic indices of chunks.

Returns:

List of argument topics. Each entry is itself a list containing the topic indices of the chunks that belong to that argument.

Return type:

List[list[int]]
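
The grouping described above can be sketched as follows. This is an illustration under assumptions, not the library code: the name `argument_topics_sketch` is hypothetical, arguments are emitted in order of first appearance, and each entry is returned as a tuple to match the signature.

```python
from typing import Dict, List, Tuple


def argument_topics_sketch(arg_ids: List[int], topics: List[int]) -> List[Tuple[int, ...]]:
    """Hypothetical sketch: collect chunk topics per argument, keeping duplicates."""
    grouped: Dict[int, List[int]] = {}
    for arg_id, topic in zip(arg_ids, topics):
        # Duplicates are appended as-is; they are not removed.
        grouped.setdefault(arg_id, []).append(topic)
    # Dicts preserve insertion order, so arguments come out in first-seen order.
    return [tuple(chunk_topics) for chunk_topics in grouped.values()]
```

With `arg_ids=[0, 0, 1, 1, 1]` and `topics=[2, 2, 0, 3, 0]`, the sketch yields `[(2, 2), (0, 3, 0)]`: topic 2 appears twice for the first argument because both of its chunks were assigned to it.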

orangearg.argument.miner.processor.get_argument_sentiment(arg_ids: List[int], ranks: List[float], p_scores: List[float], min_sent: int = -1, max_sent: int = 1) → List[float][source]

Get argument sentiment score.

The sentiment score of an argument is calculated as a weighted sum of the sentiment scores of the chunks belonging to this argument, where the weights are the ranks of the chunks. The resulting score is then normalized into the range [0, 1].

Parameters:

arg_ids (List[int]) – the argument ids of chunks.
ranks (List[float]) – the pagerank of chunks within arguments.
p_scores (List[float]) – the sentiment polarity scores of chunks.
min_sent (int) – minimum of argument sentiment before normalization. Defaults to -1.
max_sent (int) – maximum of argument sentiment before normalization. Defaults to 1.

Returns:

List of argument sentiment scores, which are floats in range [0, 1].

Return type:

List[float]
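
A minimal sketch of the weighted sum and normalization follows. It assumes that the ranks of the chunks within one argument sum to 1, so the weighted sum stays inside [min_sent, max_sent], and that normalization is plain min-max scaling. Neither detail is confirmed by this page, and the name `argument_sentiment_sketch` is hypothetical.

```python
from typing import Dict, List


def argument_sentiment_sketch(
    arg_ids: List[int],
    ranks: List[float],
    p_scores: List[float],
    min_sent: int = -1,
    max_sent: int = 1,
) -> List[float]:
    """Hypothetical sketch: rank-weighted chunk sentiment, scaled to [0, 1]."""
    weighted: Dict[int, float] = {}
    for arg_id, rank, score in zip(arg_ids, ranks, p_scores):
        # Accumulate rank * polarity for every chunk of the argument.
        weighted[arg_id] = weighted.get(arg_id, 0.0) + rank * score
    span = max_sent - min_sent
    # Min-max scaling (an assumption) maps [min_sent, max_sent] onto [0, 1].
    return [(total - min_sent) / span for total in weighted.values()]
```

For `arg_ids=[0, 0, 1]`, `ranks=[0.5, 0.5, 1.0]` and `p_scores=[1.0, -1.0, 0.5]`, the weighted sums are 0.0 and 0.5, which scale to `[0.5, 0.75]`.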

orangearg.argument.miner.processor.get_argument_coherence(scores: List[int], sentiments: List[float], min_score: int = 1, max_score: int = 5, variance: float = 0.2) → List[float][source]

Get argument coherence.

Coherence is computed as the inverted difference between sentiments and overall scores. Overall scores are first normalized into the same range as argument sentiments, which is [0, 1]. The differences are then computed, and a Gaussian kernel is applied to invert and scale them, so that a smaller difference gives a coherence closer to 1.

Parameters:

scores (List[int]) – List of argument overall scores.
sentiments (List[float]) – List of argument sentiment scores.
min_score (int, optional) – Lower bound of scores. Defaults to 1.
max_score (int, optional) – Upper bound of scores. Defaults to 5.
variance (float) – variance of the Gaussian kernel. Defaults to 0.2.

Returns:

List of argument coherence scores, in the range (0, 1].

Return type:

List[float]
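
The computation can be sketched as below. The exact kernel is an assumption: this sketch uses exp(-d² / (2 · variance)), where d is the difference between the normalized score and the sentiment, and the library may use a different constant in the exponent. The name `argument_coherence_sketch` is hypothetical.

```python
import math
from typing import List


def argument_coherence_sketch(
    scores: List[int],
    sentiments: List[float],
    min_score: int = 1,
    max_score: int = 5,
    variance: float = 0.2,
) -> List[float]:
    """Hypothetical sketch: Gaussian-kernel coherence of score vs. sentiment."""
    span = max_score - min_score
    coherences = []
    for score, sentiment in zip(scores, sentiments):
        # Bring the overall score into [0, 1], the same range as sentiment.
        normalized = (score - min_score) / span
        diff = normalized - sentiment
        # Assumed kernel form: zero difference gives 1, larger gaps decay toward 0.
        coherences.append(math.exp(-(diff ** 2) / (2 * variance)))
    return coherences
```

Under these assumptions, a top score of 5 paired with a sentiment of 1.0 gives a coherence of exactly 1.0, while a score of 1 paired with the same sentiment gives exp(-2.5), roughly 0.08.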

orangearg.argument.miner.processor.update_argument_table(df_arguments: pandas.DataFrame, topics: List[List[int]], sentiments: List[float], coherences: List[float]) → pandas.DataFrame[source]

Return a copy of the argument dataframe with new columns for argument topics, sentiments, and coherences.

Parameters:

df_arguments (pd.DataFrame) – argument dataframe.
topics (List[List[int]]) – list of argument topics
sentiments (List[float]) – list of argument sentiment scores
coherences (List[float]) – list of argument coherence scores

Returns:

A copy of the argument dataframe with new columns for argument topics, sentiments, and coherences.

Return type:

pd.DataFrame
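
A minimal sketch of this step, assuming one row per argument and inputs ordered like the rows. The function name and the column names `topics`, `sentiment` and `coherence` are assumptions; this page does not state the names the library actually uses.

```python
from typing import List

import pandas as pd


def update_argument_table_sketch(
    df_arguments: pd.DataFrame,
    topics: List[List[int]],
    sentiments: List[float],
    coherences: List[float],
) -> pd.DataFrame:
    """Hypothetical sketch: return a copy with three extra per-argument columns."""
    # Work on a copy so the caller's dataframe is left untouched.
    df = df_arguments.copy()
    # Wrapping in a Series keeps each topic collection as a single cell value.
    df["topics"] = pd.Series(list(topics), index=df.index)
    df["sentiment"] = sentiments
    df["coherence"] = coherences
    return df
```

Because the function works on a copy, the original dataframe keeps its columns, and the returned one carries the three additional columns.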
